Telomerase Protein Component

Background of the Invention

Chromosome stability is essential for cell viability.
Eukaryotes have linear chromosomes, and the telomeres that
cap the ends protect chromosomes from degradation and
recombination. Loss of telomeric DNA during cell
proliferation may play a role in ageing and cancer.
Counter, C.M., et al. (1992) EMBO J., 11:1921-1929.
Telomeric sequences are highly conserved in
eukaryotes. The DNA sequence contains simple tandem
repeats of specific GT-rich motifs. The exact sequences
are characteristic of a particular organism; e.g.,
d(TTGGGG) in Tetrahymena, d(TTTTGGGG) in Oxytricha and
d(TTAGGG) in humans. The number of repeats on any given
chromosome end varies, giving telomeres a characteristic
heterogeneous or "fuzzy" appearance on Southern blots. In
addition to sequence conservation, telomere function is
also conserved in eukaryotes. Tetrahymena and human
telomeres function in the yeast Saccharomyces cerevisiae,
and yeast telomeres function in other fungi. Szostak, J.W.
and E.H. Blackburn (1982) Cell, 29:245-255. Thus the
mechanisms for maintaining a stable end must share
essential features in diverse eukaryotes.
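
The organism-specific repeat units named above can be
located in a sequence read with a few lines of code. The
following is a minimal sketch, not part of the disclosed
methods; the function name and the example sequence are
hypothetical, and only the three repeat units come from the
text.

```python
import re

# Telomeric repeat units named in the text (G-rich strand, 5'->3').
TELOMERE_REPEATS = {
    "Tetrahymena": "TTGGGG",
    "Oxytricha": "TTTTGGGG",
    "human": "TTAGGG",
}


def longest_repeat_run(sequence: str, unit: str) -> int:
    """Return the largest number of back-to-back copies of `unit` in `sequence`."""
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(sequence.upper())]
    return max(runs, default=0)


# Hypothetical chromosome end: arbitrary flanking bases, then five human repeats.
example_end = "ACGTCA" + TELOMERE_REPEATS["human"] * 5

if __name__ == "__main__":
    for organism, unit in TELOMERE_REPEATS.items():
        print(organism, longest_repeat_run(example_end, unit))
```

Because the number of repeats differs from one chromosome
end to the next, running such a count over many ends would
give a spread of values rather than a single number, which
is the heterogeneity seen on Southern blots.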

Telomere sequences are synthesized onto chromosome 
ends by a highly specialized DNA polymerase called 
telomerase. Telomerase is a ribonucleoprotein enzyme in 
which both the RNA and the protein components are essential 
for telomerase activity. The RNA component provides the 
template for the telomere repeat synthesis. Blackburn,
E.H. (1992) Annu. Rev. Biochem., 61:113-129.

Telomere replication involves the establishment of an
equilibrium between telomere shortening and telomere
lengthening. DNA replication leads to telomere shortening
because DNA-template-dependent DNA polymerase cannot
replicate the very end of a DNA molecule. Telomerase
elongates chromosomes through de novo sequence addition.
Double-stranded synthesis by DNA-template-dependent DNA
polymerase and primers then fills in the complementary
C-rich strand.

The RNA component of telomerase has been sequenced for
humans (Feng, J., et al. (1995) Science 269:1236-1241),
mice, and several other mammalian species (Greider, C.,
unpublished data), as well as Saccharomyces cerevisiae,
Tetrahymena, Euplotes and Oxytricha. See Singer and
Gottschling (1994) Science, 266:404-409; Lingner, et al.
(1994) Genes & Development, 8:1984-1988; Romero, D.P. and
E.H. Blackburn (1994) Cell, 67:343-353. The protein
component of a telomerase from any species has not
previously been sequenced or cloned.

Summary of the Invention 

Described herein are genes encoding a telomerase
protein component of eukaryotic, including mammalian,
origin, telomerase proteins encoded by the genes, RNA
encoding the polypeptides described, and sequences that
hybridize to these genes. As described herein, the genomic
sequences encoding a telomerase protein component have been
determined by the Applicants. Both the RNA and the protein
components of telomerase are essential in the maintenance
of telomeric length in chromosomes. The protein component
of a telomerase can be used by itself or coupled with the
RNA component in diagnostic or therapeutic methods and in
assays for telomerase.

As described herein, a Tetrahymena gene encoding a
Tetrahymena telomerase protein component has been isolated
and sequenced. The polypeptides encoded by these genes
have been shown to be an 80 kD and a 95 kD polypeptide (p80
and p95, respectively). The polypeptides comprise a
protein that, coupled with the RNA component, acts to add
telomeric TTGGGG repeats to stabilize chromosomal telomere
length. The present invention also provides DNA sequences
and portions thereof, sequences complementary to these DNA
sequences, and sequences, such as probes, that hybridize to
either the sense or the complementary (antisense) sequences
or fragments thereof that encode the polypeptides
disclosed.

In particular, an 80 kD and a 95 kD polypeptide which
are components of Tetrahymena telomerase protein have been
isolated and sequenced. The amino acid sequences of the 80
kD and 95 kD polypeptides of the protein component are
disclosed herein, as are the DNA (nucleic acid) sequences
which encode the 80 kD and 95 kD proteins.

Further disclosed are nucleotide sequences encoding
p80 and p95 telomerase polypeptides which are translated by
most eukaryotes. These DNA sequences have been
incorporated into plasmids, and the plasmids transfected
into vectors. Host cells comprising these vectors are
provided for the production of recombinant telomerase
protein component.

Both DNA sequences and polypeptide sequences that are
substantially equivalent to the disclosed sequences are
also provided by this invention.

Also included are methods of using the DNA sequences
encoding the Tetrahymena protein components to determine
the DNA sequences encoding the protein components of other
invertebrate and vertebrate species, in particular
mammalian species, such as the genes for human, mouse, rat,
dog, cat, pig, chimpanzee, or monkey telomerase protein
component.

The present work also makes available methods of
determining whether a mammal, especially a human
individual, is likely to be affected with a disorder or
disease in which abnormal telomerase activity is a symptom
or cause. Methods of detecting telomerase expression are
provided as a means of diagnosing a predisposition to the
development of immortal or cancer cells in a human or in
another animal. In one embodiment, DNA or RNA present in a
cell or tissue sample is hybridized to a DNA or RNA probe
which is complementary to all or a portion of a telomerase
protein component gene. As used herein, the term
"telomerase protein component gene" includes the genes
whose sequence is described herein, genes which hybridize
to the genes or portions thereof, and equivalent genes from
other species, such as those from human, mouse, rat, dog,
cat, pig, chimpanzee, monkey, or Tetrahymena. Detection of
hybridization is an indication of a predisposition to the
development of or the presence of cancer, or another
disorder in which immortal cells arise.

An important feature of this invention is that the
telomerase protein component can be used to screen for
telomerase inhibitors which can be used to prevent
telomerase expression and/or activity in cells. The
protein component can be used as a basis for a method to
identify and treat individuals affected by abnormal
telomerase activity either within their own cells and
tissues, or in foreign cells of invading parasites or
disease organisms which are eukaryotes.

Therefore, the present invention provides a
diagnostic tool through which inhibitors of telomerase
activity can be tested and developed, and by which diseases
such as cancer, or infections, such as yeast or protozoan
diseases, can be diagnosed.

Another embodiment of the present invention is
antibodies to a telomerase protein component, such as
antibodies to one or both of the 80 kD or 95 kD
polypeptides, synthetic telomerase polypeptide sequences,
or portions of these polypeptides. These include both
polyclonal and monoclonal antibodies, such as polyclonal
and monoclonal antibodies which bind either or both the 80
kD and 95 kD polypeptides of this invention. Such
anti-telomerase antibodies are useful to detect telomerase
activity in cells and tissues.

Further embodiments include methods of therapy and 
treatment involving recombinant and/or transgenic cells 
containing either or both of the genes for the telomerase 
subunits, by themselves or in combination with other genes, 
such as a gene encoding the telomerase RNA component. 
Recombinant or transgenic cells producing anti-telomerase
antibodies are included as well. Such methods can be 
applied to the treatment of disorders arising from abnormal 
telomerase activity or can be used to increase or trigger 
expression of telomerase to prevent cell mortality. 

Brief Description of the Figures

Figure 1 is the nucleotide sequence (SEQ ID NO:1) of
the Tetrahymena 80 kD protein gene. The nucleotide
sequence is derived from genomic and cDNA clones.

Figure 2 is the amino acid sequence (SEQ ID NO: 2) of 
the 80 kD protein deduced from the nucleotide sequence 
shown in Figure 1. 

Figure 3 is the nucleotide sequence (SEQ ID NO: 3) of
the Tetrahymena 95 kD protein gene.

Figure 4 is the amino acid sequence (SEQ ID NO: 4) of 
the 95 kD protein deduced from the nucleotide sequence 
shown in Figure 3. 

Figure 5 is primer set 1 consisting of 12
deoxyribonucleotide sequences (primers 1-12).

Figure 6 is primer set 2 consisting of 12 
deoxyribonucleotide sequences (primers 13-24). 

Figure 7 is primer set 3 consisting of 10 
deoxyribonucleotide sequences (primers 25-34). 
Figures 8A-8B show primer set 4 consisting of 18
deoxyribonucleotide sequences (primers F1-F9 and R1-R9).

Figure 9 is primer set 5 consisting of 10
deoxyribonucleotide sequences (primers F10-F14 and
R10-R14).

Figure 10 is the DNA sequence (SEQ ID NO: 8) of the 
genetically-engineered p80 gene. 

Figure 11 is the DNA sequence (SEQ ID NO: 9) of the 
genetically-engineered p95 gene. 

Detailed Description of the Invention

This invention relates to genes encoding a eukaryotic
telomerase protein component, the polypeptides encoded by
these genes, as well as the RNA encoding the polypeptides,
complementary nucleotide sequences, and probes that
hybridize to sense and complementary portions of the
nucleotide sequences.

Further provided are synthesized genes encoding 80 kD
and 95 kD telomerase protein components, the recombinant
polypeptides encoded by these genes, the RNA encoding the
polypeptides, the primers used to synthesize these genes,
and complementary nucleotide sequences or fragments
thereof.

Those skilled in the art will appreciate that many
different DNA sequences can encode a single protein. In
addition to the genes and other nucleotide sequences
described above, contemplated within this invention are DNA
sequences which encode catalytically active telomerase
protein components, and nucleotide sequences that hybridize
to these DNA sequences. Generally, these will hybridize
under moderately stringent conditions. According to the
invention, the term "stringent conditions" means
hybridization conditions comprising a salt concentration of
4X SSC (NaCl-citrate buffer) at 62°-66° C., and "high
stringent conditions" means hybridization conditions
comprising a salt concentration of 0.1X SSC at 68° C.
Ausubel, et al., (1994) Current Protocols in Molecular
Biology, John Wiley & Sons, Inc.
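
The two salt conditions can be restated as molar
concentrations. The sketch below assumes the usual
laboratory convention that 1X SSC is 150 mM NaCl and 15 mM
sodium citrate; that composition is an assumption brought
in for illustration and is not stated in the text, and the
names used are hypothetical.

```python
# Composition of 1X SSC in millimolar (assumed laboratory convention,
# not taken from the text).
SSC_1X_MILLIMOLAR = {"NaCl": 150.0, "sodium citrate": 15.0}

# The two hybridization conditions as defined in the text.
CONDITIONS = {
    "stringent": {"ssc_fold": 4.0, "temp_c": (62, 66)},
    "high stringent": {"ssc_fold": 0.1, "temp_c": (68, 68)},
}


def ssc_composition(fold: float) -> dict:
    """Scale the assumed 1X SSC recipe to the requested strength."""
    return {solute: round(mm * fold, 3) for solute, mm in SSC_1X_MILLIMOLAR.items()}


if __name__ == "__main__":
    for name, cond in CONDITIONS.items():
        print(name, cond["temp_c"], ssc_composition(cond["ssc_fold"]))
```

Under this assumption the high stringent wash has forty
times less salt than the stringent one, which, together
with the higher temperature, destabilizes imperfectly
matched hybrids.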

Methods of using these sequences to deduce other
telomerase components are described. Methods of diagnosis
and treatment which use a telomerase protein component,
nucleotide sequences encoding the protein component or
portions thereof are also included.

Following is a description of the embodiments of the
invention, which, together with the following examples
delineating the experimental procedures, serve to explain
the principles of the invention. All references to
materials and methods are herein incorporated by reference.

The present invention also encompasses polypeptides
comprising a telomerase protein component of eukaryotic
origin, including the polypeptides herein described. All
polypeptides which comprise a telomerase protein component
and are active as a component of a telomerase are
encompassed by the present invention and the term
telomerase protein component as used herein.

The telomerase protein component has been produced by
the following method in "substantially pure" form.
"Substantially pure" is defined as the minimum amino acid
sequence that, when combined with the telomerase RNA
component, demonstrates telomerase activity.

Tetrahymena Protein Component Genes

Tetrahymena telomerase enzyme was purified using
readily available chromatography matrixes. Two criteria
were used to follow enzyme purification. First, activity
assays were performed using the standard telomerase assay
(Greider, C.W. (1987) Cell, 51:887-898) and 32P-dGTP
incorporation was quantitated by spotting on DE-81 paper
and determining the counts incorporated (Greider, C.W.
(1987) Ph.D. Thesis, Univ. Calif. Berkeley). Second,
telomerase RNA was followed by Northern blot analysis and
quantitated by comparison to a titration of a known amount
of a synthetic telomerase RNA standard. Purification over
hydroxylapatite, spermine agarose, Sepharose CL-6B sizing
column, phenyl-Sepharose, DEAE agarose (or Q-Sepharose) and
a 15- or 20-35% glycerol gradient yielded highly purified
telomerase fractions. Two predominant proteins of 80 and
95 kD were identified in the active fractions which
co-purified with telomerase activity and were present in a
stoichiometry similar to the telomerase RNA.

Two samples of the material purified as described
above were separated on a non-denaturing gel. One lane of
the gel was Northern blotted to identify the position of
the telomerase RNA and the other lane was cut from the
native gel and run in a second dimension on an SDS PAGE
gel. Most of the proteins remained near the well of the
first native gel; however, both the telomerase RNA and the
p80 and p95 proteins ran approximately one-third of the way
into the native gel at equivalent positions, indicating
that p80 and p95 are components of telomerase. Beginning
with over 300 L or 1.2 x 10^11 cells, the active fraction in
the final glycerol gradient contained over a microgram of
telomerase RNA. This indicated there was enough material
to sequence the co-purifying polypeptides.

To determine if the p80 and p95 fraction comprises
telomerase activity or is a contaminant that migrates with
the same properties, telomerase was treated with
micrococcal nuclease. Previous experiments have shown that
limited cleavage of the telomerase RNA does not completely
inactivate telomerase activity. Greider, C.W. and E.H.
Blackburn (1989) Nature, 337:331-337. Two fractions of
purified telomerase were prepared for glycerol gradient
analysis, to determine whether cleavage of the RNA would
alter the mobility of the RNP in a glycerol gradient. One
sample was briefly treated with micrococcal nuclease; the
other was incubated with buffer only. These samples were
sedimented through a glycerol gradient and fractions were
collected from each gradient. The activity was assayed and
the protein profile determined. In the untreated fraction,
activity peaked in fractions 8 and 9 along with p80 and
p95. In the micrococcal nuclease-treated fraction, weak
activity peaked in fraction 10, and the peak of p80 and p95
was now also shifted to fraction 10, indicating that these
proteins behave as expected for telomerase components. The
sedimentation of most other proteins in the gradient
remained unchanged relative to the change in sedimentation
of telomerase.

The partial peptide sequences from both p80 and p95
were determined. The complete amino acid sequences of the
two polypeptides can be determined in the same manner.
Telomerase from 344 L of Tetrahymena cells was purified
according to the procedures described above with the
addition of a DEAE agarose concentration step followed by
non-denaturing gel electrophoresis and SDS PAGE
electrophoresis. To avoid problems associated with direct
N-terminal sequencing of proteins, the excised protein
bands were digested with lysylendopeptidase from
Achromobacter. The peptide fragments were extracted from
the gel and resolved on a C18 reverse phase HPLC column.
Several well defined peptide peaks were subjected to
successive rounds of Edman degradation on an Applied
Biosystems automated sequencer. From two separate
preparations of telomerase, the amino acid sequence was
determined for 7 peptides from p80 and for 25 peptides from
p95. Degenerate oligonucleotides were designed using the
Tetrahymena codon bias as a guide. Martindale, D.W. (1989)
J. Protozool. 36:29-34.
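
Designing a degenerate oligonucleotide from a peptide
amounts to back-translating each residue into every codon
that can encode it. The following is a minimal sketch of
that step, assuming the Tetrahymena nuclear code in which
TAA and TAG encode glutamine; the function names are
hypothetical. It enumerates full degeneracy and does not
apply codon-usage frequencies, so it is a simpler
substitute for the codon-bias-guided design described
above, not a reproduction of it.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
STANDARD_AA = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")

# IUPAC ambiguity codes keyed by the set of bases they stand for.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G",
    frozenset("T"): "T", frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W", frozenset("GT"): "K",
    frozenset("AC"): "M", frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V", frozenset("ACGT"): "N",
}


def codon_table(tetrahymena: bool = True) -> dict:
    """Map each DNA codon to a one-letter amino acid ('*' marks a stop)."""
    codons = ["".join(c) for c in product(BASES, repeat=3)]
    table = dict(zip(codons, STANDARD_AA))
    if tetrahymena:
        # In Tetrahymena, TAA and TAG are read as glutamine, not stop.
        table["TAA"] = "Q"
        table["TAG"] = "Q"
    return table


def degenerate_oligo(peptide: str, tetrahymena: bool = True) -> str:
    """Back-translate a peptide into an IUPAC-coded degenerate oligonucleotide.

    Each codon position is collapsed independently, so for residues with six
    codons (L, S, R) the result also matches a few non-coding combinations.
    """
    table = codon_table(tetrahymena)
    oligo = []
    for residue in peptide.upper():
        codons = [c for c, aa in table.items() if aa == residue]
        if not codons:
            raise ValueError(f"no codon for residue {residue!r}")
        for pos in range(3):
            oligo.append(IUPAC[frozenset(c[pos] for c in codons)])
    return "".join(oligo)


if __name__ == "__main__":
    # Peptide taken from Table 1 (antibodies 85 and 86), without the added C.
    print(degenerate_oligo("EFGLEPNILK"))
```

In practice a codon-usage table would be used to drop
rarely used codons from each position, which lowers the
degeneracy of the primer pool.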
Oligonucleotides were used in sets of two to obtain
PCR products from either reverse transcribed RNA or from
genomic DNA. Two PCR products were obtained for each
protein gene. The sequence of three of the four PCR
products encoded peptides which had been identified by
protein sequencing but were not used as primers for the PCR
(Figures 2 and 4). Genomic Southern blots probed with the
PCR products for either p80 or p95 proteins showed that each
gene probably exists as a single copy in the Tetrahymena
genome. Northern blot analysis from actively growing cells
showed a single band of about 3.0 kb for p95 and a single
band of 2.5 kb for the p80 mRNA. RNA from stationary cells
showed two bands when probed with the 5' portion of the 95
kD gene. This suggests alternative processing of this
gene.

To obtain the full length protein sequence, the cloned
PCR products were used as probes for both Tetrahymena cDNA
libraries and genomic libraries. Positive clones were
obtained, subcloned, and sequenced. To deduce these
protein sequences, the Tetrahymena genetic code was used
since this code differs from that of other eukaryotes.
Prescott, D.M. (1994) Microbiol. Rev., 58:233-267.
Applicants have determined the sequence for the entire open
reading frame (ORF) for both the p80 and p95 proteins
(Figure 1, SEQ ID NO:1 and Figure 3, SEQ ID NO:3,
respectively). The nucleotide sequence is derived from
genomic and cDNA clones; polyadenylation of the mRNA occurs
near the 3' end of the reported sequence. Several
observations support the assignment of these ORFs. First,
Northern blot analysis of the p80 and p95 mRNAs suggests
sizes of approximately 2.47 and 2.9 kb. Applicants have
obtained more than 2.4 and 2.8 kb of sequence for these
mRNAs. Second, all reliable peptide sequence was found in
the ORFs (7/7 for p80; 25/25 for p95). Third, the suggested
translation of the mRNAs from the first methionine codon in
the longest ORF yields predicted protein products of equal
or slightly greater molecular mass than predicted from
analysis of the proteins by SDS-PAGE. Fourth, sequences
outside the region translated as coding contain a higher
content of A/T than coding regions, typical of Tetrahymena
genes. Prescott, D.M. (1994) Microbiol. Rev., 58:233-267.
Neither of the genes has a counterpart in the GenBank, EMBL,
PIR and Swissprot databases.
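
The choice of genetic code changes which reading frame
counts as the longest ORF. The sketch below illustrates
this on a made-up sequence, assuming that TGA is the only
stop codon in the Tetrahymena nuclear code; the function
name and the test sequence are hypothetical and are not
taken from SEQ ID NO:1 or SEQ ID NO:3.

```python
TETRAHYMENA_STOPS = frozenset({"TGA"})            # TAA and TAG encode glutamine
STANDARD_STOPS = frozenset({"TAA", "TAG", "TGA"})


def longest_orf(dna: str, stops: frozenset = TETRAHYMENA_STOPS) -> str:
    """Return the longest ATG-to-stop stretch found in the three forward frames.

    The returned string starts at the first methionine codon of the ORF and
    includes the terminating stop codon. An empty string means no ORF.
    """
    dna = dna.upper()
    best = ""
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first methionine in this stretch
            elif start is not None and codon in stops:
                orf = dna[start:i + 3]
                if len(orf) > len(best):
                    best = orf
                start = None                   # look for the next ORF in frame
    return best


# Hypothetical sequence with in-frame TAA and TAG codons before a TGA stop.
example = "CC" + "ATG" + "TAA" + "CAA" + "TAG" + "GGA" + "TGA" + "CC"

if __name__ == "__main__":
    print(longest_orf(example, TETRAHYMENA_STOPS))
    print(longest_orf(example, STANDARD_STOPS))
```

Read with the standard code, the same sequence terminates
at the first TAA, which is why the deduced protein
sequences depend on using the Tetrahymena code.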

To demonstrate that the 80 kD and 95 kD proteins are
components of telomerase, polyclonal antibodies were
generated against the two proteins. Synthetic peptides
were synthesized that corresponded to two different regions
from each protein. Two polyclonal antibodies to peptides
of the 80 kD protein (designated A81 and A82) and four
antibodies to peptides of the 95 kD protein (designated
A83, A84, A85, and A86) showed good titre against the
respective proteins.

Table 1 lists the antibodies obtained to various
peptide sequences used. The peptide sequence list was
obtained directly from protein sequencing of PCR products.
The first peptide was derived from a preliminary sequencing
trial and was determined to be incorrect after the gene was
cloned. This peptide and the antibodies directed against
it were subsequently used as controls. The peptide
injected into rabbits to produce A85 and A86 has one error
(a missing T at the penultimate position) relative to the
cloned sequence; however, the antibodies against this
peptide cross-react with the 95 kD protein. An N-terminal
C residue was added to each peptide during synthesis in
order to couple the peptide to carrier protein.
TABLE 1

Antibodies Generated Against Peptide Sequences

Antibody #   Protein Directed   Peptide Sequence            Amino Acid #
             Against                                        in Sequence

79                              poor (incorrect) sequence
80                              poor (incorrect) sequence
81           80kD               (C)AEGYSDINVRG              628-
82           80kD               (C)AEGYSDINVRG              628-
83           95kD               (C)QNEFQFNNVK               610-
84           95kD               (C)QNEFQFNNVK               610-
85           95kD               (C)EFGLEPNILK               414-
86           95kD               (C)EFGLEPNILK               414-

Of these antibodies, those with the highest affinity
for the 80 kD protein (A82) and the 95 kD protein (A86)
were used to demonstrate that both the 80 kD and 95 kD
polypeptides co-purified with telomerase activity,
indicating that these proteins are telomerase components.
Results of immunoprecipitation studies with the 80 kD
protein are consistent and suggest the 80 kD protein is a
functional component of telomerase.

Synthetic Protein Component Genes 

To produce genes that encode the telomerase protein
components in other eukaryotes, synthetic gene sequences
were constructed in which the Tetrahymena genetic code was
altered to enable correct translation and transcription in
organisms having the genetic code which is used/translated
in most eukaryotes; e.g., mammals such as humans. The
genes were synthesized as gene fragments from overlapping
sets of oligonucleotides (primer sets), which were then
cloned into plasmids. The full-length genes were
constructed by combining the fragments in the plasmids.
The p80 gene was constructed in the plasmid Bluescript; the
p95 gene in the plasmid pSE280, although any plasmid can be
used.
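
One change needed when moving a Tetrahymena ORF into a host
that uses the standard code is replacing in-frame TAA and
TAG codons, which Tetrahymena reads as glutamine, with the
standard glutamine codons CAA and CAG. The following is a
minimal sketch of that substitution only; the function name
and example sequence are hypothetical, and the synthesized
genes of Figures 10 and 11 may differ from the native genes
in other ways.

```python
# In-frame codons that Tetrahymena reads as glutamine, mapped to standard
# glutamine codons that differ only at the first base.
GLN_SWAPS = {"TAA": "CAA", "TAG": "CAG"}


def recode_for_standard_code(orf: str) -> str:
    """Rewrite an in-frame Tetrahymena ORF so TAA/TAG no longer act as stops.

    The ORF is assumed to start at its first codon and, if it ends with a
    stop codon, to end with TGA, which is left untouched.
    """
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    codons = [orf[i:i + 3] for i in range(0, len(orf), 3)]
    return "".join(GLN_SWAPS.get(codon, codon) for codon in codons)


if __name__ == "__main__":
    # Hypothetical ORF: Met, Gln (TAA), Gln (CAA), Gln (TAG), stop (TGA).
    print(recode_for_standard_code("ATGTAACAATAGTGA"))
```

Only codons in the reading frame are changed, so a TAA or
TAG that spans two adjacent codons is left as it is.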

To express the p80 and p95 proteins, the synthesized
genes were cloned into different restriction sites of the
pRSET and pBlueBac vectors. Transcription and translation
of the genes in pRSET and pBlueBac generates the
recombinant proteins in E. coli and baculovirus,
respectively. A His-tag and cleavage site at the end of
each recombinant protein facilitates the purification of
the proteins.

Using E. coli and a vector such as pRSET containing
the p80 and p95 gene constructs or, alternatively,
baculovirus and a vector such as pBlueBac containing the
same constructs, p80 and p95 can be expressed
recombinantly. Thus, applicants have produced the first
known bacterial strains or expression vectors which permit
expression of the p80 and p95 telomerase protein
components. One embodiment of this invention is the
production of one or more recombinant telomerase protein
components in a host cell. One method comprises culturing
a host cell containing the gene encoding the protein, or a
homolog thereof, under conditions which permit production
of the protein. In one embodiment, the method further
comprises the steps of recovering quantities of protein as
well as purification procedures. Skilled artisans will
appreciate the various ways in which recombinant proteins
of this invention can be prepared.

The recombinant proteins, or fragments thereof, are
useful to detect agents that stimulate or inhibit
telomerase catalytic activity. They are also useful to
produce antibodies for screening assays, such as to detect
telomerase activity in tumor cells, or stimulated or
inhibited production of telomerase in response to exposure
to a compound.

Also encompassed by this invention is a telomerase
polypeptide which varies in amino acid sequence from a
telomerase polypeptide encoded by genomic DNA (i.e.,
differs from a naturally-occurring telomerase polypeptide,
such as Tetrahymena telomerase polypeptide), without
affecting the ability of the polypeptide to combine with
the other telomerase protein and RNA components or affect
the enzymatic activity of telomerase. These variations may
include additions, deletions, substitutions and other
alterations (e.g., modification of an amino acid residue)
to the amino acid sequences.

The genes encoding the Tetrahymena telomerase protein
component, the synthesized genes, or the primers can be
used to clone the human telomerase protein component and
other mammalian telomerase protein components, using known
methods described herein.

Two approaches can be used to clone the human 
telomerase protein genes with the Tetrahymena, synthesized, 
or primer sequences. These procedures are described in 
detail in the subsequent examples. 

In one approach, DNA sequence hybridization is used to
identify and clone a human homologue of the Tetrahymena
protein. Human genomic DNA and mRNA blots are probed with
the Tetrahymena gene at a series of increasing
stringencies. If specific bands are identified, cDNA or
genomic libraries cloned into phage lambda vectors are
probed at a similar stringency to identify the gene for the
human homologue. Positive phage are restriction mapped,
subcloned and sequenced.

A second approach is to produce a series of antibodies
to various regions of both the p80 and p95 proteins. The
antibodies described above (A82 and A86) can be used.
Additional antibodies can also be generated against
synthetic peptides and fusion proteins and used to identify
human telomerase proteins by cross-reactivity. Libraries of
human or other mammalian cDNAs which express a portion of
the protein can be probed with the antibodies to clone the
human or mammalian genes by standard molecular biology
procedures. See Sambrook, et al. (1989) Molecular Cloning
- A Laboratory Manual, Cold Spring Harbor Press, Cold
Spring Harbor Laboratory, NY.

Human telomerase is an excellent target for anti-cancer
therapy. The availability of the protein components
for the Tetrahymena enzyme facilitates a thorough
understanding of telomerase biochemistry and will aid in
the identification of specific anti-telomerase drugs.
Telomerase activity has been found in over 70
immortalized human cell lines and cancer tissues, but few
human primary somatic cells or tissues. Kim, et al. (1994)
Science 266:2011-2015. Telomere length maintenance does
not occur in primary human somatic cells that have a
limited life span. When primary cells divide, either in
vitro or in vivo, telomere length shortens. Germline cells
do not show this shortening. Allsopp, et al. (1992) PNAS
89:10114-10118; Harley, et al. (1990) Nature, 337:331-337;
Vaziri, et al. (1993) Amer. J. Hum. Genet. 52:661-667.
Although telomerase activity is present in immortalized
human HeLa cells (Morin, G.B. (1989) Cell 59:521-529),
telomerase has not been detected in primary fibroblast
cultures. Applicants established SV40 immortalized lines
from primary human cells to investigate the connection
between telomerase and telomere shortening. In both
primary and SV40 transfected human embryonic kidney cells,
telomeres shortened but telomerase was not detected as the
cells were passaged. When the culture underwent crisis
most cell lines died; however, in the immortal clones that
survived, telomerase activity was detected and the
telomeres were short but stably maintained. Counter, et
al. (1992) EMBO J. 11:1921-1929. Similar results were
obtained with primary mouse cells in culture. Prowse, K.R.
and C.W. Greider (1995) PNAS 92:4818.

These results suggest that primary cells express
little or no telomerase activity, but that following
immortalization, cancer cells reactivate telomerase and
maintain telomere length. In fact, telomerase activity has
been demonstrated in human ovarian carcinoma cells, but not
in normal cervical endothelial cells. Counter, et al.
(1994) PNAS 91:2900-2904. Telomere shortening before
crisis may be lethal, but those cells that can reactivate
telomerase maintain telomere length and survive crisis.
This model suggests that if telomerase is required for the
growth of immortalized cells, telomerase inhibitors may be
excellent anti-cancer drugs.

The present work provides a method by which cancers
may be diagnosed prior to or during clinical manifestation
of symptoms by means of detecting telomerase activity in
somatic cells that normally do not express telomerase.
Telomerase mRNA expression in a sample of somatic cells or
tissue can be detected using DNA or RNA probes; this is
indicative of expression of telomerase which, in turn, is
an indication of immortal cancer cells since somatic cells
do not normally produce telomerase. Detection of
hybridization is an indication of a predisposition to
cellular immortalization or cancer, or to the presence of
cancer or immortal cells.

By hybridization, it is meant that DNA and/or RNA
molecules or portions thereof are used in a hybridization
analysis to detect complementary polynucleotides under
conditions of moderate stringency according to methods
described in Ausubel, et al., (1994) Current Protocols in
Molecular Biology, (Suppl. 26), John Wiley & Sons, Inc.
In one embodiment of detecting the presence of
immortal cells or a predisposition to immortalization in a
eukaryotic tissue sample or a sample of eukaryotic cells,
nucleic acids are used as probes or primers. This
embodiment may comprise the steps of:

a) obtaining a tissue sample or a sample of cells from the
eukaryote; and

b) determining the presence of telomerase in the sample,

wherein if the sample demonstrates the presence of
telomerase, immortal cells or the predisposition to
immortalization is present. The same method may be used to
detect a predisposition to cancer or the presence of cancer
cells or tissue.

Alternatively, the expression of mammalian telomerase
can be detected using polyclonal or monoclonal antibodies
to the p80 or p95 polypeptide subunits, to both subunits or
fragments thereof. An antibody can detect both subunits or
two antibodies can be used, each of which detects a
different subunit. For example, a sample of somatic or
tumor cells from an individual can be contacted with
anti-telomerase antibodies after the sample has been
processed or treated to render the telomerase (if present)
available for binding to the antibody. Binding of the
antibody is indicative of the presence of telomerase and,
thus, an indication of cellular immortalization or a
predisposition to cancer, or the presence of cancer or
immortal cells.

A method using antibodies to detect telomerase in a
eukaryotic tissue sample or a sample of eukaryotic cells
may comprise the steps of:

a) obtaining a tissue sample or a sample of cells from the
eukaryote;

b) treating the sample to render telomerase available for
binding to anti-telomerase antibodies, thereby producing a
treated sample;

c) contacting the treated sample with anti-telomerase
antibodies (polyclonal or monoclonal); and

d) detecting binding of the antibodies to telomerase,

wherein if binding occurs, telomerase is present. It will
be appreciated that antibody detection can be useful not
only to detect cellular immortalization such as occurs with
the development of cancer cells, and the presence of cancer
or immortal cells, but also to detect the presence of
foreign eukaryotic cells in the cells and tissues of a
multicellular organism, as described below.

The present invention also provides a means for
developing drugs and pharmaceutical compounds that destroy
or otherwise inactivate or interfere with the activity of
telomerase. A compound that inhibits or inactivates
Tetrahymena telomerase activity can also be assessed for
its effects on mammalian telomerases. The telomerase
protein component, either with or without the RNA
component, can be used to screen for drugs and
pharmaceutical compounds effective as anti-cancer and
anti-microbial agents, as described below.

Further, since additional telomerase activity may have
an anti-aging effect and result in restoration of cells by
stabilizing telomere length, compounds can be screened for
their ability to stimulate or trigger telomerase activity.
The protein components can also be combined with the RNA
component of telomerase to produce a functional telomerase
molecule which can be delivered to cells by conventional
methods. Alternatively, DNA encoding a telomerase molecule
can be introduced into target cells by recombinant DNA
methods and transformation technology. The incorporation
of extra copies of functional telomerase molecules may
extend the replicative life span of the host cell by
stabilizing telomere length. Thus, this invention includes
methods for gene therapy in mammals.

Another application of this invention is the
detection of eukaryotic disease-causing organisms in
somatic cells and tissues of mammals and treatment of the
resulting disease. There are many fungi, protozoa, and
even algae that invade the cells and tissues of mammals and
are the cause of various diseases. Examples of such
diseases include, but are not limited to, aspergillosis,
histoplasmosis, candidiasis, paracoccidioidomycosis,
malaria, trichinosis, filariasis, trypanosomiasis (sleeping
sickness), schistosomiasis, toxoplasmosis, and
leishmaniasis. These organisms require telomerase and
express this enzyme as they multiply inside host cells
which do not normally produce telomerase. The
above-described methods to detect telomerase can be used to
develop early detection and diagnosis procedures for these
eukaryotic microbial parasites.

An example of such a method to detect a disease caused
by a eukaryotic microbial organism in a tissue sample or a
sample of eukaryotic cells from an individual may comprise
the steps of:

a) obtaining a tissue sample or a sample of cells from the
individual; and

b) determining the telomerase in the sample, wherein if
the sample demonstrates telomerase of a eukaryotic microbe,
a disease caused by a eukaryotic microbial organism is
present.

The telomerase in the sample can be determined by the use
of nucleic acid probes or primers, including, but not
limited to, those described herein, or by the use of
antibodies which bind to a telomerase protein component.

Furthermore, since mammalian somatic cells do not
require telomerase, the use of inhibitors of and
antibiotics against telomerase will provide a method of
treatment for such diseases that is nontoxic or exhibits
little toxicity to the host. For example, most of the
drugs used to treat diseases caused by Trypanosoma species
can cause serious side effects and even death. Antisense
RNA to the 80 kD or 95 kD protein component of Trypanosoma
sp. telomerase or drugs against telomerase can be used to
inhibit telomerase and thus prevent the multiplication of
species of this parasite in an individual without affecting
the host's somatic cells and tissues. Included among
these pharmaceuticals are antisense nucleic acids that
inhibit the translation of mRNA encoding the protein
component of telomerase.

Compounds that inhibit or destroy telomerase activity
can be formulated into pharmaceutical compositions
containing a pharmaceutically acceptable carrier and/or
other excipients using conventional materials and means.
They can be administered to an animal, either human or
non-human, for therapy of a disease or condition resulting
from an abnormal level of telomerase activity. Administration
may be by any conventional route (parenteral, oral,
inhalation, and the like) using appropriate formulations,
many of which are well known. The compounds can be
employed in admixture with conventional excipients, such as
pharmaceutically acceptable organic or inorganic carrier
substances suitable for parenteral administration that do
not deleteriously react with the active derivatives.

It will be appreciated that the actual preferred
amounts of active compound in a specific case will vary
according to the specific compound being utilized, the
particular compositions formulated, the mode of
application, the particular situs of application, and the
individual being treated. Dosages for a given recipient
will be determined on the basis of individual
characteristics, such as body size, weight, age and the
type and severity of the condition being treated.

It should be noted that the formulations described
herein may be used for veterinary as well as human
applications and that the term "individual" or "host"
should not be construed in a limiting manner. These terms
include human and nonhuman vertebrates, particularly
mammals.

In a further aspect, the present invention provides a
process for producing a recombinant product comprising:

(a) producing an expression vector which includes DNA which
encodes a telomerase molecule;

(b) transfecting or infecting a host cell with the vector;
and

(c) culturing the transfected or infected cell line to
produce the encoded telomerase molecule (recombinant
telomerase).

The standard techniques of molecular biology
can be used to prepare DNA sequences coding for the RNA and
protein components of telomerase, and for construction of
vectors with appropriate promoters for enzyme expression in
a host cell. Suitable host cell/vector systems,
transfection or infection methods and culture methods are
well known in the art. These systems may also be used to
produce antibodies to telomerase.

It will also be appreciated that the methods described 
above may be used to produce transgenic cells, tissues, and 
organisms for use in investigating the role of telomerase 
in eukaryotic organisms, and for therapeutic purposes. 

Thus, this invention provides transgenic biological
materials that comprise the protein components of
telomerase from eukaryotes, including mammals.

The present invention will now be illustrated by the
following examples, which are not intended to be limiting
in any way.

Example 1

Purification of Telomerase

Tetrahymena thermophila were grown to a density of 4.0
x 10^5/ml in PPYS (2% proteose peptone, 0.2% yeast extract,
10 µM FeCl3), harvested by centrifugation in a GSA rotor
(Sorvall), and starved for 18 h in Dryl's (1.7 mM sodium
citrate, 1.2 mM NaH2PO4, 1.3 mM Na2HPO4, 2 mM CaCl2). Starved
cells were again harvested, resuspended in T2MG buffer (20 mM
Tris-HCl pH 8.0, 1 mM MgCl2, 10% glycerol, 2 mM DTT or
β-mercaptoethanol, 0.1 mM PMSF, 2 µg/ml leupeptin, 1 µg/ml
pepstatin), and lysed by addition of a final concentration
of 0.2% NP-40. S-100 extract was obtained by
centrifugation of lysed cells at 130,000 x g for 50 min at
4°C. All subsequent steps were done at 4°C.

S-100 extract derived from 1-2 x 10^11 cells
(approximately 300 L of PPYS culture) was filtered coarsely
and applied to ceramic HAP (AIC) equilibrated in T2MG.
Telomerase was eluted with a gradient from 0.2 M K2HPO4 in
T2MG. Fractions with peak activity were pooled, diluted
with 3 volumes of T2MG, and applied to Spermine agarose
(Sigma) equilibrated in T2MG with 0.15 M potassium
glutamate (KC5H8NO4, abbreviated KG). Telomerase was eluted
in T2MG with 0.65 M KG. Fractions with peak activity were
pooled and loaded on a 1 L column of Sepharose CL-6B
(Pharmacia) equilibrated and run in T2MG with 20 mM KG and
3 mM NaN3. Fractions with peak activity were pooled,
adjusted to 0.4 M KG, and applied to Phenyl Sepharose
(Pharmacia) equilibrated in T2MG with 0.4 M KG. The column
was washed in T2MG, then telomerase was eluted in T2MG with
1% Triton X-100. Fractions with peak activity were pooled
and applied to DEAE agarose (BioRad) equilibrated in T2MG.
Telomerase was eluted with a gradient or a step to 0.4 M KG
in T2MG. Fractions with peak activity were sometimes
diluted with distilled water and were layered on 15- or
20-35% glycerol gradients. Gradients were centrifuged for 20
h in an SW41 rotor (Beckman). Glycerol gradient-purified
telomerase was used in several experiments described in
this invention.

Telomerase was additionally purified prior to
proteolytic digests for peptide sequencing. Glycerol
gradient fractions of peak activity were pooled and applied
to DEAE agarose equilibrated in T2MG. Telomerase was
eluted in T2MG with 0.4 M KG. Peak fractions were dialyzed
against T2MG then applied to a 6% acrylamide, 50 mM
Tris-acetate gel run in 50 mM Tris-acetate buffer, pH 8.0. The
native gel was run for approximately 12 h at approximately
250 V. The native gel lane containing telomerase was
excised, soaked briefly in 2X SDS sample buffer (0.125 M
Tris-HCl pH 6.8, 4% SDS, 10% β-mercaptoethanol, 20%
glycerol, bromophenol blue) and sealed into the well of a
denaturing 7% acrylamide gel with 0.1% agarose in mM
Tris-acetate. SDS-PAGE was performed in Tris-glycine-SDS buffer
(25 mM Tris-HCl pH 8.3, 192 mM glycine, 0.1% SDS).



Example 2 

Analytical Scale Two-Dimensional Gel Analysis

Fractions were adjusted to at least 10% glycerol and
loaded on a native 6% acrylamide, 50 mM Tris-acetate
minigel. The native gel was run in 50 mM Tris-acetate
buffer, pH 8.0. The native gel lane containing telomerase
was excised, soaked briefly in 2X SDS sample buffer, and
sealed into the well of a denaturing 5-15% or 5-20%
gradient acrylamide minigel with 0.1% agarose in 25 mM
Tris-acetate. SDS-PAGE was performed in Tris-glycine-SDS
buffer (25 mM Tris-HCl pH 8.3, 192 mM glycine, 0.1% SDS).
After electrophoresis, gels were soaked 2 x 10 min in 50%
methanol and equilibrated in 5% methanol. Silver staining
was performed by incubation of the gel in 0.1 mM DTT for 20
min, 1 mg/ml silver nitrate for 20 min, and development in
0.28 M Na2CO3, 0.0185% formaldehyde. The staining reaction
was quenched with citric acid.

Example 3 

Proteolytic Digestion of Telomerase Subunits and Peptide
Sequencing

The 80 and 95 kD subunits of telomerase were purified
as described above. Preparative SDS gels containing
telomerase were stained in 0.05% Coomassie brilliant blue
(Aldrich), 20% methanol, 0.5% acetic acid and destained in
methanol-acetic acid. Polypeptides were excised from the
gel after soaking 10 min in distilled water. Gel slices
were crushed and soaked in 50% methanol 2 x 20 min,
decanted, and dried briefly under vacuum. Proteins were
digested with approximately 300 ng of Achromobacter
protease I in 0.1 M Tris-HCl pH 9.0, 0.01% Tween-20 for 24
h at 37°C. Peptides were separated from gel fragments by
spin filtration, concentrated by Speed-Vac, and applied to
a C-18 column (Vydac). Peptides were eluted with a
gradient of acetonitrile:isopropanol (3:1) in 0.09%
trifluoroacetic acid. Peaks of absorbance at 214 nm were
collected, lyophilized, and applied to a protein sequencer
(ABI).

Example 4 

Cloning of Genes for the 80 and 95 kD Telomerase Subunits

Degenerate primers were designed from peptide
sequences with consideration of Tetrahymena codon usage
frequencies. Martindale, D.W. (1989) J. Protozool.
36:29-34. These primers were used in multiple combinations
under a variety of PCR conditions. Templates for PCR included
Tetrahymena macronuclear genomic DNA, or total or poly-A+
RNA (prepared as described in Ausubel, et al., (1992)
Current Protocols in Molecular Biology, John Wiley & Sons,
Inc. and Sambrook, et al., supra; from Tetrahymena grown
and starved as described above). Products from PCR
amplification were purified, cloned in E. coli, and
sequenced by standard protocols. PCR products were
confirmed to derive from the p80 or p95 gene or cDNA if the
PCR product encoded additional peptide sequence not specified
by the PCR primer, either as an entirely internal peptide
or as sequence adjacent to that specified by the degenerate
PCR primer used in the reaction.

PCR products were used to screen an oligo dT-primed
Tetrahymena cDNA library in λgt10 (Takemasa, et al. (1989)
J. Biol. Chem. 264:19293-19301). Only partial clones (0.8
kb or less) were obtained. Genomic libraries were
constructed in Bluescript KS+ (Stratagene) with EcoRI or
ClaI digested Tetrahymena genomic DNA. A 3.2 kb clone was
obtained that contained most of the gene for p80. A 1.1 kb
clone was obtained that contained an internal region of
coding sequence for the p95 gene. Fragments containing
other portions of the p95 gene were detected by Southern
blot of EcoRI digested genomic DNA, but were drastically
under-represented in the constructed libraries. To obtain
the 5' end of the cDNAs for both genes, a RACE protocol was
followed (Gibco # 18374-025) using poly-A+ RNA from starved
Tetrahymena. To determine the 3' end of the cDNA for p95,
lambda clone sequences were compared with sequence obtained
by 3' RACE. The 3' RACE was performed based on the
protocol above, using oligo dT priming from the mRNA
poly-A+ tail for reverse transcription, combined with priming
from within the known sequence of the genomic clone for
PCR. To determine the 3' end of the p80 cDNA, the
sequences of lambda and genomic clones were compared; the
lambda clones obtained for p80 terminate at the 3' end with
poly-A sequence present only as four adenine residues in
the genomic clone. The results of 3' RACE for p80 support
this region as the site of polyadenylation.

Previous determination of the nucleic acid sequence of
p95 had indicated the nucleotide at position 405 to be "G"
and nucleotides at positions 977-979 to be "CGT". This
resulted in an "R" instead of "Q" and "A", respectively, in
the encoded protein.

Example 5 

Generation of Antibodies to the 80 kD and 95 kD
Polypeptides

Synthetic peptides were synthesized that corresponded
to two different regions from each protein. Peptides were
purchased from Genosys Biotechnologies. These peptides
were coupled to Keyhole Limpet Hemocyanin (KLH) carrier
protein via an additional amino terminal cysteine residue
using standard protocols. Harlow, et al. (1988) Antibodies
- A Laboratory Manual, Cold Spring Harbor Press, Cold
Spring Harbor, NY. Each of the peptides coupled to KLH
protein was injected into two separate rabbits using
standard protocols including periodic boosts with the
antigen. Harlow, et al., supra. Sera from the rabbits were
sampled every several weeks. The animal injections were
performed at Hazelton Corporation. Sera from all of the
rabbits after 3-4 boosts with the antigen were obtained and
initially tested in ELISA assays against the synthetic
peptides used to inject each rabbit. Several of the crude
sera specifically recognized the peptides. The antibodies
were then tested against total Tetrahymena and purified
telomerase fractions on Western blots. Two rabbits
immunized with one peptide had very good titre against the
80 kD protein. The antibodies from each of these rabbits
are designated A81 and A82. Similarly, four rabbits
immunized with two peptides had a good titre against the 95
kD protein and these antibodies are designated A83, A84,
A85 and A86. A82 had the highest affinity for the 80 kD
protein and A86 had the highest affinity for the 95 kD
protein.

The antibodies were then affinity purified by binding
to a column with the specific peptide coupled to it.
Antibodies were eluted from the column first with an
acetate buffer (0.1 M NaOAc, pH 4.0) and subsequently with
a glycine buffer (0.1 M glycine, pH 2.7) to remove the
tighter binding antibodies. The affinity purified
antibodies were used for both Western blots and
immuno-precipitation. Western analysis with sera containing
A82 and A86 antibodies showed that both the 80 and 95 kD
polypeptides co-purified with telomerase activity
throughout the entire column purification scheme (see
Example 1). The levels of both the 95 and 80 kD proteins
paralleled the fold increase in enzyme activity at each
stage in the purification, consistent with these proteins
being telomerase components.

Telomerase activity was specifically
immuno-precipitated by the highest affinity antibody directed
against the 80 kD protein (A82). Immuno-precipitation was
carried out using standard techniques (Harlow, et al.,
supra). The affinity purified antibody was incubated with
agarose beads (Pharmacia) coupled to protein G. After the
initial binding reaction the highly purified fraction from
a non-peak region of the glycerol gradient (see Example 1)
was incubated with the beads and telomerase was allowed to
bind for 3-4 hours at 4°C. The beads were then spun at a
very low speed in an Eppendorf tube and the supernatant was
removed. The beads were washed three times in T2MG plus
0.1 M KG, 0.5% NP-40 and resuspended in the same buffer.
Telomerase activity was then assayed in the supernatant,
the final wash and pellet fraction for each antibody.
Antibody A82 showed telomerase activity in the pellet and
the supernatant was depleted for activity. As a control,
an affinity purified antibody which did not recognize
either the 80 or 95 kD polypeptide on Western blots (A80)
was used in the immuno-precipitation. With this antibody,
activity remained in the supernatant, as was the case with
the lower affinity antibodies directed against the other 80
and 95 kD polypeptides (A83, A84, A85 and A86). These results
indicate that the 80 kD polypeptide is a functional
component of telomerase.

Example 6

Synthesis of Genes for the 80 and 95 kD Telomerase Subunits 

To express the Tetrahymena proteins in any organism
besides ciliates, the Tetrahymena codons that use UAA and
UAG to encode glutamine (Martindale, D.W. (1989) J.
Protozool. 36:29-34) must be replaced. In most eukaryotes
these codons denote "stop"; thus, their presence
prevents expression of full length proteins. Because there
are 44 glutamine codons in the p95 gene and 18 glutamine
sites in the p80 gene that require change, these genes were
synthesized de novo rather than using site-directed
mutagenesis to make each substitution. To construct the
synthetic genes, it was first established which codon would
be used to code for each amino acid. These codons were
chosen by their frequency of use in E. coli and baculovirus
(Rohrmann, G.F. (1986) J. Gen. Virol. 67:1499-1513; Zhang,
G. et al. (1991) Gene 105:61-72), two efficient systems for
expressing recombinant proteins. The list of codons chosen
is shown in Table 2.

TABLE 2

Codons Used for Generation of Synthetic
Telomerase p80 and p95 Proteins

Amino acid    Codon(s)
Ala           GCT  GCC  GCA  GCG
Arg           CGT  CGC
Asn           AAC
Asp           GAC
Cys           TGT
Gln           CAA
Glu           GAG  GAA
Gly           GGC
His           CAC  CAT
Ile           ATC
Leu           CTG
Lys           AAG
Met           ATG
Phe           TTC
Pro           CCG
Ser           AGC  TCA  TCC
Thr           ACC  ACT
Trp           TGG
Tyr           TAC
Val           GTT  GTG  GTC  GTA


A GeneWorks (IntelliGenetics) computer program was
used to "reverse translate" the protein sequence of both
p80 and p95 using the codons in Table 2 for the genetic
code. This created a somewhat degenerate DNA sequence due
to the degeneracy of the genetic code chosen. The
predicted restriction map from this degenerate sequence was
then examined to find unique restriction sites
approximately every 300 bp in each gene sequence. Where
necessary, restriction sites were eliminated individually
by choosing a different codon to encode a particular amino
acid. This process was repeated until a unique DNA
sequence was obtained that had the appropriate placement of
restriction sites. The final DNA sequences, including the
engineered restriction sites at the 5' and 3' ends used for
cloning, are shown in Figures 10 and 11.
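
The reverse translation and site-elimination loop described above can be sketched roughly as follows. This is an illustrative Python sketch only, not the GeneWorks procedure: the function names, the rule of defaulting to the first codon listed in Table 2, and the toy peptide "QAC" are assumptions made for the example, and the fourth Ala codon is read here as GCG.

```python
# Codons from Table 2, keyed by one-letter amino acid code.
# Assumption: the first codon listed is the default choice.
TABLE2 = {
    "A": ["GCT", "GCC", "GCA", "GCG"], "R": ["CGT", "CGC"], "N": ["AAC"],
    "D": ["GAC"], "C": ["TGT"], "Q": ["CAA"], "E": ["GAG", "GAA"],
    "G": ["GGC"], "H": ["CAC", "CAT"], "I": ["ATC"], "L": ["CTG"],
    "K": ["AAG"], "M": ["ATG"], "F": ["TTC"], "P": ["CCG"],
    "S": ["AGC", "TCA", "TCC"], "T": ["ACC", "ACT"], "W": ["TGG"],
    "Y": ["TAC"], "V": ["GTT", "GTG", "GTC", "GTA"],
}


def reverse_translate(protein, choice=None):
    """Build DNA from a protein; `choice` maps residue index -> codon index."""
    choice = choice or {}
    return "".join(TABLE2[aa][choice.get(i, 0)] for i, aa in enumerate(protein))


def site_positions(dna, site):
    """Return the 0-based start of every occurrence of a restriction site."""
    return [i for i in range(len(dna) - len(site) + 1)
            if dna[i:i + len(site)] == site]


def remove_site(protein, site, choice=None):
    """Swap codons one residue at a time until `site` no longer occurs."""
    choice = dict(choice or {})
    while True:
        dna = reverse_translate(protein, choice)
        hits = site_positions(dna, site)
        if not hits:
            return dna, choice
        start = hits[0]
        first, last = start // 3, (start + len(site) - 1) // 3
        for i in range(first, last + 1):          # residues overlapping the site
            for k in range(len(TABLE2[protein[i]])):
                trial = {**choice, i: k}
                remaining = site_positions(reverse_translate(protein, trial), site)
                if len(remaining) < len(hits):    # accept only real progress
                    choice = trial
                    break
            else:
                continue
            break
        else:
            raise ValueError(f"cannot remove {site} with the available codons")


# Toy example: the default codons for Gln-Ala-Cys create a HindIII site (AAGCTT).
print(reverse_translate("QAC"))           # CAAGCTTGT
print(remove_site("QAC", "AAGCTT"))       # ('CAAGCCTGT', {1: 1})
```

Because several amino acids in Table 2 have a single codon, some sites cannot be removed this way (the sketch raises an error in that case), which is consistent with the text's note that the process was repeated until an acceptable sequence was found.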

To synthesize the two genes shown in Figures 10 and
11, a set of 28 overlapping oligonucleotides was
synthesized for p80 (primer sets 4 and 5, Figures 8 and 9,
respectively) and a set of 34 was synthesized for p95
(primer sets 1, 2 and 3, Figures 5, 6 and 7, respectively)
and the genes were constructed by overlap extension PCR.
Prodromou, C. and L.H. Pearl (1992) Protein Engineering
5:827-829; Bambot, S.B. and A.J. Russell (1993) PCR Meths.
and Appl. 2:266-271. The oligonucleotides were purchased
from Bioserve Biotechnologies (Laurel, MD). Each
oligonucleotide is approximately 100 nt long and is
designed to overlap with its complement to give a hybrid of
20 base pairs. Each oligonucleotide also has a
phosphorothioate linkage in place of the usual
phosphodiester at the 3' end of the oligonucleotide. This
phosphorothioate will prevent exonuclease removal of the 20
bp hybrid overlap during the initial polymerase elongation
step. Skerra, A. (1992) Nucl. Acids Res. 20:3551-3554.
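
As a rough illustration of the oligonucleotide layout described above, the following Python sketch tiles a sequence into alternating top-strand and bottom-strand oligonucleotides in which neighbors share a 20 nt overlap. The function names and the placeholder sequence are hypothetical, the oligonucleotides are idealized as exactly 100 nt (the text says "approximately"), and the 3' phosphorothioate modification is not modeled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def tile_oligos(gene, oligo_len=100, overlap=20):
    """Split `gene` into alternating-strand oligos sharing `overlap` nt."""
    if overlap >= oligo_len:
        raise ValueError("overlap must be shorter than the oligo length")
    step = oligo_len - overlap
    oligos, start, index = [], 0, 0
    while True:
        window = gene[start:start + oligo_len]
        # Even-numbered oligos are top strand, odd-numbered are bottom strand,
        # so the 3' ends of neighbors anneal over the shared region.
        oligos.append(window if index % 2 == 0 else reverse_complement(window))
        if start + oligo_len >= len(gene):
            return oligos
        start += step
        index += 1


def assembled_length(n_oligos, oligo_len=100, overlap=20):
    """Length spanned by n idealized oligos after overlap extension."""
    return n_oligos * oligo_len - (n_oligos - 1) * overlap


# Placeholder sequence only; it is not the p80 or p95 gene.
gene = ("ACGGTCAT" * 283)[:2260]
oligos = tile_oligos(gene)
print(len(oligos))                                   # 28
print(assembled_length(28), assembled_length(34))    # 2260 2740
```

Under these idealized assumptions, 28 and 34 oligonucleotides would span about 2.3 kb and 2.7 kb respectively; the actual synthetic genes may differ because the real oligonucleotides are only approximately 100 nt long.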

The p80 gene was constructed in two pieces by
combining the first set of oligonucleotides (primer set 4,
Figures 8A-8B) pair-wise, then using PCR to amplify the
entire region as described. The second half was
constructed in a similar manner using primer set 5 (Figure
9). Each half of the gene was cloned into the plasmid
Bluescript and sequenced in its entirety to be sure no new
mutations were introduced. The 5' half was cloned on a
BamHI-EcoRI fragment and the 3' half was cloned on an
EcoRI-KpnI fragment. The full-length gene was then constructed
by combining the two fragments in pBluescript (Stratagene)
in the appropriate order.

The p95 gene was constructed in three pieces by
combining each set of oligonucleotides pair-wise and then
using PCR to amplify the entire region. The first fragment
was constructed using primer set 1, the second fragment was
constructed with primer set 2 and the third with primer set
3. Each of the three fragments of the gene was cloned
into the plasmid pSE280 (Invitrogen) and sequenced in its
entirety. The 5' fragment (Fragment and primer set 1) was
cloned on an NcoI-BstBI fragment, the internal piece
(Fragment and primer set 2) was on a BstBI-EcoRI fragment,
and the 3' fragment (Fragment and primer set 3) was cloned
on an EcoRI-HindIII fragment. The full-length gene was
then constructed by first combining fragments 1 and 2 in
the pSE280 plasmid, and subsequently adding the 3' fragment
to complete the gene.

Example 7

Expression and Purification of Recombinant p80 and p95

The p80 and p95 proteins are expressed in E. coli and
baculovirus by cloning the full length constructs into pRSET
and pBlueBac vectors, respectively (Invitrogen), by methods
known to those of skill in the art (See, e.g., Ausubel,
supra; Sambrook, supra). These vectors allow expression
and purification of the recombinant proteins. The p80 is
cloned into the BamHI and HindIII sites of pRSET and
pBlueBac and the p95 is cloned into the NcoI and HindIII
sites. Transcription and translation of these constructs
generates the recombinant proteins followed by a series of
6 histidines (His-tag), separated by an EK (Enterokinase)
protease cleavage site. This allows each protein to be
purified by processing over a Ni2+ chelating column. The
purified protein is removed from the His tag by digestion
with the protease Enterokinase.

Example 8 

Cloning of Human Telomerase Protein Components

Two general approaches can be taken to cloning the
human genes: DNA sequence based approaches and antibody
directed approaches. The first approach takes advantage of
the DNA sequence of the Tetrahymena genes to directly
identify the human homologues. Those skilled in the art
will recognize three different strategies that are used to
clone homologues based on DNA sequence: (1) direct
hybridization of human genomic or cDNA libraries with the
Tetrahymena gene; (2) identification of conserved regions
in telomerase protein in other species and PCR
amplification of a human gene based on these regions; and
(3) a systematic strategy to saturate all regions of the
telomerase genes with PCR probes and identification of a
human homologue using PCR to "walk" along the length of the
gene. All of the methodology is based on standard
molecular genetic laboratory procedures (Sambrook, et al.,
supra).

In the first strategy, the Tetrahymena gene is used to
probe human genomic DNA and mRNA blots at a series of
increasing stringencies. When specific bands are
identified, the cDNA or genomic library can be probed at a
similar stringency to identify the gene for the human
homologue. Positive phage are then restriction mapped,
subcloned, and sequenced.

The second strategy involves cloning the telomerase
proteins from other ciliates first because telomerase
proteins may have only limited conservation at the DNA
sequence level between humans and Tetrahymena. Then the
mammalian counterparts are cloned using information
obtained from these cDNAs. Thus, using the cloned
Tetrahymena genes, libraries from distantly related
Tetrahymena or Oxytricha and Euplotes can be probed at
medium stringency to identify genes which cross hybridize.
Since the ciliates Oxytricha and Euplotes have telomerase
enzymes which are functionally similar to the Tetrahymena
telomerase (Lingner, et al. (1994) Genes Dev. 8:1984-1998;
Shippen-Lentz, D. and E.H. Blackburn (1990) Science
247:546-552), it is likely that homologous proteins can be
identified with this method. The genes for both the p95
and p80 homologues from both ciliates can be fully
sequenced and regions of the highest degree of similarity
between the different species can be identified. Three
conserved regions are chosen for each protein to use in
reverse transcriptase PCR-based approaches to cloning the
human gene. The same approach is taken to clone the genes
for both p80 and p95. Degenerate
oligonucleotides encoding the conserved regions in the
Tetrahymena, Oxytricha and Euplotes telomerase proteins are
synthesized using human translational codon biases. PCR is
initially carried out with two of the three
oligonucleotides. The 3' most oligonucleotide (oligo 1)
can be complementary to the mRNA. Thus cDNA is synthesized
from isolated mRNA using oligo 1 as a primer. The 5' most
oligo (oligo 2) can be oriented 5' to 3' in the direction
opposite to oligo 1 and can be identical in sequence to the
mRNA strand. This oligonucleotide is then used in a PCR
step along with oligo 1 to amplify the region between oligo
1 and oligo 2. Finally a third primer (oligo 3) is
directed against a conserved region of the protein which
lies between the regions targeted by oligos 1 and 2. The
sequence of the oligo will be complementary to the mRNA.
The PCR product amplified with oligos 1 and 2 is
reamplified using oligo 2 and oligo 3. This is to assure
that the specific products generated all have three
conserved regions of the telomerase proteins. The PCR
products are sequenced to identify those that contain
protein homologues.
The third strategy is a systematic scanning approach
to find regions of homology between the Tetrahymena and
human genes, and then to amplify these regions by PCR.
Using the protein sequence of the Tetrahymena genes as a
guide, a series of primer oligonucleotides is generated
that encode regions of the Tetrahymena genes yet utilize
the human codon bias in the DNA sequence. Initially, a set
of primers, differing in the region to which they hybridize
by 10 amino acids, is generated against the 3' end of the
gene. These are used in an RT PCR reaction to generate
cDNA. Next, a set of primers oriented from the 5' end of the
gene toward the 3' end is used to amplify the cDNA. All
possible combinations of two PCR primers from the 5' end
and 3' end can be used together to identify bands that are
the size expected for the regions in the Tetrahymena
protein. If specific products are generated they are
reamplified using primers that should anneal within the
initial two primers. Specific products which are not
generated by any primer alone are subcloned and sequenced.
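
The primer scanning scheme described above can be illustrated with a short Python sketch that lays out primer target windows every 10 amino acids and lists the product size expected for each pairing of a 5' and a 3' primer, assuming the human cDNA has the same spacing as the Tetrahymena protein. The function names and the 7-residue (21 nt) primer window are hypothetical choices for the example; the text specifies only the 10 amino acid offset.

```python
def primer_windows(protein_len, window_aa=7, step_aa=10):
    """0-based start residues of primer target windows spaced `step_aa` apart."""
    return list(range(0, protein_len - window_aa + 1, step_aa))


def expected_products(forward_starts, reverse_starts, window_aa=7):
    """Expected product size in bp for each forward/reverse window pair.

    Sizes assume intron-free cDNA and the same residue spacing as the
    Tetrahymena protein; pairs where the reverse window does not lie
    downstream of the forward window are skipped.
    """
    sizes = {}
    for f in forward_starts:
        for r in reverse_starts:
            if r > f:
                sizes[(f, r)] = (r + window_aa - f) * 3
    return sizes


# Toy 50-residue protein: windows start at residues 0, 10, 20, 30 and 40.
windows = primer_windows(50)
print(windows)
# Pair the two 5'-most windows with the two 3'-most windows.
print(expected_products(windows[:2], windows[-2:]))
```

For the toy case, the four pairings give expected products of 81, 111, 111 and 141 bp; bands of other sizes would not match the spacing predicted from the Tetrahymena protein.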



Example 9 

Antibody Directed Approaches to Cloning Human Telomerase

This method takes advantage of conserved epitopes on
the protein surface that are recognized by antibodies. A
series of antibodies is made to various regions of both
the p80 and p95 proteins. Antibodies such as those
described in Example 5 can be used. The antigens for
antibody production are generated as synthetic peptides or
as fusion proteins. It is faster to produce synthetic
peptides, since the fusion proteins do not have to be
generated first; however, these peptides may not give as
high a titer as fusion proteins. Peptides of approximately
10-15 residues are selected, preferably at the N- and
C-terminal ends of the protein, since these residues are
likely to be unstructured and thus antigenic. In addition,
computer analysis can be applied to determine the
hydrophilicity and predicted secondary structure of the
protein to choose regions which are likely to be
unstructured and near the surface of the protein. The
synthetic peptide is coupled to a carrier such as KLH and
used to inoculate rabbits. If the appropriate amino acids
for coupling to the carrier are not present in the peptide,
a linker cysteine residue is added to the N-terminus.
Fusion proteins are generated with the T7 polymerase
system (Studier, et al. (1990) Meth. Enzymol. 185:60-89)
and purified for inoculation into rabbits or mice. The
sera are then screened on Western blots using extracts from
E. coli expressing the cloned protein or purified
fractions. The positive antibodies are tested for their
ability to recognize the 95 or 80 kD proteins on Western
blots and/or to specifically immunoprecipitate telomerase
RNA or otherwise inhibit telomerase activity. As a
control, the ability of the anti-peptide antibody to
precipitate telomerase RNA should be abolished when it is
pre-incubated with the peptide. Because mouse and rabbit
sera and monoclonal culture medium inhibit telomerase
activity (L. Harrington, unpublished results), affinity-
purified IgG is used to test the ability of the antibodies
to inhibit telomerase activity.
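
The peptide selection step described above can be sketched in a few lines of Python. This is an illustration only, not the procedure used in this example: the choice of the Kyte-Doolittle hydropathy scale, the window length, the `terminal_span` cutoff and all function names are assumptions introduced here, and cysteine is assumed to be the only residue suitable for coupling to the carrier.

```python
# Kyte-Doolittle hydropathy values; more negative means more hydrophilic.
# (Assumption: the text does not say which hydrophilicity scale is used.)
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(peptide):
    """Average hydropathy of a peptide (raises KeyError on non-standard residues)."""
    return sum(KYTE_DOOLITTLE[aa] for aa in peptide) / len(peptide)


def candidate_peptides(protein, length=12, terminal_span=40):
    """Rank windows of `length` residues near either terminus.

    A window qualifies if it starts within `terminal_span` residues of the
    N-terminus or ends within `terminal_span` residues of the C-terminus.
    Returns (score, start, peptide) tuples, most hydrophilic first.
    """
    protein = protein.upper()
    n = len(protein)
    scored = []
    for start in range(n - length + 1):
        near_n_terminus = start < terminal_span
        near_c_terminus = start + length > n - terminal_span
        if near_n_terminus or near_c_terminus:
            window = protein[start:start + length]
            scored.append((mean_hydropathy(window), start, window))
    return sorted(scored)


def add_linker_cysteine(peptide):
    """Prepend an N-terminal cysteine if the peptide has none (hypothetical rule)."""
    return peptide if "C" in peptide else "C" + peptide
```

A call such as `candidate_peptides(p80_sequence, length=12)[:5]`, where `p80_sequence` is a hypothetical string holding the protein sequence, would list the five most hydrophilic terminal windows; predicted secondary structure is not considered in this sketch.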

Both polyclonal and monoclonal antibodies against
Tetrahymena telomerase proteins can be used to identify
cross-reacting telomerase proteins from human and mouse
cells on Western blots. If a positive signal is found,
purified fractions of human telomerase are used to determine
if the reactive band co-purifies with telomerase activity.
Evidence of co-purification indicates that the cross-
reacting band is a component of human telomerase.
Antibodies which give the best signal are then used to
probe lambda gt11 expression libraries. If monoclonal
antibodies are used, several different antibodies are
pooled for probing the expression libraries. For
polyclonal antibodies, two or three different antibodies
are used on duplicate plates. Only those phage which light
up with both probes are considered positive. These plaques
are purified and the inserts subcloned and sequenced.
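
The duplicate-plate criterion above amounts to keeping only signals that appear at the same position with both probes. A minimal sketch, assuming the signals on each filter have already been recorded as (x, y) coordinates in millimetres on aligned filters; the coordinate representation, the `tolerance_mm` value and the function name are hypothetical and not part of the method as described:

```python
from math import hypot


def confirmed_positives(signals_a, signals_b, tolerance_mm=1.0):
    """Return signals from filter A that have a partner on filter B.

    Two signals are treated as the same plaque when their centres lie
    within `tolerance_mm` of each other (an assumed alignment tolerance).
    """
    confirmed = []
    for xa, ya in signals_a:
        if any(hypot(xa - xb, ya - yb) <= tolerance_mm for xb, yb in signals_b):
            confirmed.append((xa, ya))
    return confirmed
```

Signals seen with only one of the two probes are dropped, which mirrors the rule that single-probe positives are not picked for purification.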

Example 10 
Cloning and use of the mouse homologue

The same two procedures of DNA homology or antibody
cross-reactivity described above can also be used in
parallel to identify the mouse telomerase protein
components. Mouse telomerase clones can be useful in
testing cancer therapies and for understanding the biology
of mammalian telomerase. Identification of mouse
telomerase will also allow the use of transgenic mice to
test the roles of telomere length and telomerase in vivo.
Once either the mouse or human homologue has been
identified, that clone is used to deduce the sequence
of the homologue from the other organism. Sequence
similarity is high between human and mouse genes, making it
a straightforward process to obtain the clone for one with
a probe from the other. Both genomic and cDNA libraries are
then plated and probed at moderate stringency to identify
cross-hybridizing plaques. The positive plaques are then
selected, restriction mapped and sequenced to determine
whether the telomerase protein homologue has been cloned.
Further functional analysis, such as reconstitution and
gene disruption, is then applied with the human and mouse
clones.

Those skilled in the art will recognize, or be able to
ascertain using no more than routine experimentation, many
equivalents to the specific embodiments of the invention
described specifically herein. Such equivalents are
intended to be encompassed in the scope of the following
claims.
Claims

We claim: 

1. A substantially pure telomerase protein component.

2. A substantially pure protein component of Claim 1,
which is a Tetrahymena telomerase protein component.

3. Isolated DNA which encodes a telomerase protein
component and is identical to or substantially
homologous to the nucleotide sequence of SEQ ID NO:1
or SEQ ID NO:3.

4. DNA which hybridizes under moderate stringency
conditions to the DNA according to Claim 3.

5. Isolated RNA transcribed from or complementary to the
DNA of Claim 3.

6. Isolated RNA transcribed from or complementary to the
DNA of Claim 4.

7. A polypeptide encoded by DNA of Claim 3.

8. An isolated polypeptide comprising the amino acid
sequence of SEQ ID NO:5.

9. An isolated polypeptide comprising the amino acid
sequence of SEQ ID NO:7.

10. Isolated DNA which encodes a polypeptide identical to
or substantially equivalent to the amino acid sequence
of SEQ ID NO:2 or SEQ ID NO:4.

11. An anti-telomerase antibody which binds a peptide
comprising an amino acid sequence selected from the
group consisting of: AEGYSDINVRG, QNEPQFNNVK and
EFGLEPNILT.

12. An anti-telomerase antibody which binds all or a
portion of a substantially pure telomerase protein
component.



13. A method of detecting the presence of immortal cells
or a predisposition to immortalization of cells in a
eukaryotic tissue sample or a sample of eukaryotic
cells, comprising the steps of:

a) obtaining a tissue sample or a sample of
cells from the eukaryote; and

b) determining the presence of telomerase in the
sample, wherein if the sample demonstrates presence of
telomerase, immortal cells or the predisposition to
immortalization is present.

14. A method of detecting a disease caused by a eukaryotic
microbial organism in a eukaryotic tissue sample or a
sample of eukaryotic cells, comprising the steps of:

a) obtaining a tissue sample or a sample of
cells from the eukaryote; and

b) determining the telomerase in the sample,
wherein if the sample demonstrates telomerase of a
eukaryotic microbe, a disease caused by a eukaryotic
microbial organism is present.

15. A method of detecting telomerase in a eukaryotic
tissue sample or a sample of eukaryotic cells,
comprising the steps of:

a) obtaining a tissue sample or a sample of
cells from the eukaryote;

b) treating the sample to render telomerase
available for binding by anti-telomerase antibodies,
thereby producing a treated sample;

c) contacting the treated sample with anti-telomerase
antibodies; and

d) detecting binding of the antibodies to
telomerase, wherein if binding occurs, telomerase is
present.

16. The method of Claim 15, wherein the presence of
telomerase indicates a predisposition to cancer or the
presence of cancer.

17. The method of Claim 15, wherein the presence of
telomerase indicates the presence of immortal cells or
a predisposition to immortalization.

18. The method of Claim 15, wherein the presence of
telomerase of a eukaryotic microbe indicates the
presence of a disease caused by a eukaryotic microbial
organism.

19. A method of identifying a compound that inhibits,
destroys, or interferes with telomerase activity in
eukaryotic cells, comprising administering the
compound to a Tetrahymena cell and measuring activity
of Tetrahymena telomerase in the cell, wherein if the
Tetrahymena telomerase activity is reduced, the
compound is a telomerase inhibitor.

20. A method of identifying a compound that inhibits,
destroys, or interferes with telomerase activity in
mammalian, including human, cells, comprising
administering the compound to eukaryotic cells
expressing telomerase activity and measuring the


WO 96/19580 



PCT/US95/16531 






activity of telomerase, wherein if the telomerase 
activity is reduced, said compound or compounds are 
identified as telomerase inhibitors. 

21. A therapeutic or diagnostic compound comprising an
inhibitor of telomerase activity in combination with a
pharmaceutically acceptable carrier, diluent or
excipient.

22. A therapeutic or diagnostic compound comprising an
amino acid sequence encoded by isolated DNA according
to Claim 10 in combination with a pharmaceutically
acceptable carrier, diluent, or excipient.

23. A pharmaceutical composition comprising all or
substantially all of one or both polypeptides
according to SEQ ID NO:2 or SEQ ID NO:4, in
combination with a pharmaceutically acceptable
diluent, excipient, or carrier.



24. A process for the preparation of a therapeutic or
diagnostic composition comprising combining an
anti-telomerase compound together with a pharmaceutically
acceptable excipient, diluent, or carrier.



25. A method for treating a disease caused by a eukaryotic
microorganism in a mammal comprising administering to
the mammal an amount of a telomerase inhibitor
effective to inhibit telomerase activity in the
microorganism.

26. A method of inhibiting the activity of eukaryotic
microbial parasites, especially fungal and protozoan
parasites, in mammals comprising administering to the
mammal an amount of a telomerase inhibitor effective
to inhibit the activity of the eukaryotic microbial
parasites.

27. A method of therapeutic treatment of a human or animal
suffering with a disorder associated with an abnormal
level of telomerase activity comprising inhibiting the
production of telomerase if said level is too high or
administering telomerase if said level is too low.

28. A transgenic eukaryotic cell or organism containing
the DNA sequence of Claim 3 or a sequence
complementary to said sequence.

29. A transgenic prokaryotic cell containing the DNA
sequence of Claim 3 or a sequence complementary to
said sequence.

30. A transgenic eukaryotic cell or organism containing
the nucleotide sequences SEQ ID NO:1 and SEQ ID NO:3.

31. A process for producing recombinant telomerase,
comprising the steps of:

(a) producing an expression vector which includes
DNA which encodes a telomerase molecule;

(b) transfecting or infecting a host cell with
the vector; and

(c) culturing the transfected or infected cell
line to produce the encoded telomerase.

32. A process for producing a recombinant anti-telomerase
antibody comprising:

(a) producing an expression vector which includes
DNA which encodes an anti-telomerase antibody;

(b) transfecting or infecting a host cell with
the vector, thereby producing a transfected or
infected host cell; and

(c) culturing the transfected or infected cell to
produce the anti-telomerase antibody.

33. A DNA sequence comprising SEQ ID NO:8 or SEQ ID NO:9.

34. A synthetic telomerase protein component encoded by a
DNA sequence according to Claim 33.

35. A DNA or RNA sequence that hybridizes to a DNA
sequence according to Claim 33.

36. An expression vector comprising DNA selected from the
group consisting of any of the primers 1-34, F1-F14,
and R1-R14.



37. A host cell comprising an expression vector comprising
DNA encoding a telomerase protein component or a
fragment thereof.

38. A method for producing a telomerase protein component
or a fragment thereof, comprising the step of
culturing a host cell of Claim 37 under conditions
which permit production of the telomerase protein
component or fragment thereof.

39. A method according to Claim 38 wherein two or more
telomerase protein components or fragments are
produced in the same cell.

40. The method of Claim 38, further comprising the step of
purifying the telomerase protein component or
fragment.
1 aactcattta attaetaatt taatcaaeaa gattgataaa aagcagtaaa taaaacccaa
61 tagatttaat ttagaaagta tcaattgaaa aatggaaatt gaaaacaact aagcacaata 
? c ? aaaa « cc gaaaaattgt ggtgggaaet tgaattagag atgcaagaaa accaaaatga 
1*7 f ^f5 aAgtt *9Wttaaga ttgacgatcc taageaatat ctcgtgaacg tcactgcagc 
If* f^fltttgttg taggaaggta gttactaeta agataaagat gaaagaagat atatcatcac 
301 taaagcactt cttgaggtgg ctgagtctga tcctgagttc atctgctagt tggcagteta 
361 catccgtaat gaactttaca tcagaactac cactaactac attgtagcat tttgtgttgt 
;2: f CMM ' Mt acteaaccat tcatcgaaaa gtaettcaac aaagcagtac ttttgcctaa 
481 tgacttactg gaagtctgtg aatttgcata ggttctctat atttttgatg caaetgaatt 
If} f**??**" 9 tatctt 9*ta ggataettte ataagatatt egtaaggaac tcactttccg 
601 taagtgttta eaaagatgcg tcagaagcaa gttttetgaa ttcaacgaat actaacttgg 
661 taagtattgc actgaatcct aacgtaagaa aacaatgttc cgttacctct cagttaccaa 
721 caagtaaaag tgggattaaa ctaagaagaa gagaaaagag aatctcttaa ccaaacttta 
781 ggcaataaag gaatctgaag ataagtccaa gagagaaaet ggagacataa tgaacgttga 
841 agatgcaatc aaggctttaa aaccageagt tatgaagaaa atagccaaga gatagaatgc 
901 catgaagaaa cacatgaagg cacctaaaat tcctaactct accttggaat caaagtactt 
iaSi 9*ccttcaag gatctcatta agttctgcca tatttctgag cctaaagaaa gagtctataa 
Tao? J* tc f t Jff9t aaaaaatacc ctaagaccga agaggaatac aaagcagcet ttggtgattc 
1081 tgcatctgca cccttcaatc ctgaattggc tggaaagegt atgaagattg aaatctctaa 
Toa? *«atgggaa aatgaactca gtgcaaaagg caacactgct gaggtttggg ataatttaat 
T**7 "fM??f at t ** ct cccat atatggccat gttacgtaae ttgtctaaea tcttaaaagc 
f \%\ SS* 9 ^ 0 * « atM *f«« «ctctattgt gatcaacaag atttgtgagc ccaaggccgt 
t9a * aactcc aagatgttcc ctctteaatt ctttagtgcc attgaagctg ttaatgaagc 
T*IT *gttactaag ggattcaagg ccaagaagag agaaaatatg aatcttaaag gtcaaatcga 
«9cagtaaag gaagttgttg aaaaaaccga tgaagagaag aaagatatgg agttggagta 
1501 aaccgaagaa ggagaatttg ttaaagtcaa cgaaggaatt ggcaagcaat acattaactc 
HI} fattgaactt gcaatcaaga tageagttaa caagaattta gatgaaatca aaggacacae 
,fZ? *? caatct t* tctgatgttt ctggttctat gagtacctca atgtcaggtg gagccaagaa 
^JS*" 9ttc 9 tactt gtctcgagtg tgcattagte cttggtttga tggtaaaata 
TpftT ff 9 ^?* 9 ** **9tcetcat tctacatctt cagttcacct agttctcaat gcaataagtg 
"acttagaa gttgatctcc ctggagacga actcegtcct tetatgtaaa aacttttgca 
*W*»W» aaacttggtg gtggtactga tttcccctat gagtgcattg atgaatggac 
TiSt " a f* ata " actcacgtag acaatatcgt tattttgtct gatatgatga ttgcagaagg 
" 8 } ftattcagat atcaatgtta gaggcagttc cattgttaac agcatcaaaa agtacaagga 
* 9aa «* aaat cctaacatta aaatctttgc agttgactta gaaggttacg gaaagtgcct 
2101 taatctaggt gatgagttca atgaaaacaa ctaeatcaag atattcggta tgagcgatte 
m5? til?**™! ttcatttcag ccaagcaagg aggagcaaat atggtcgaag ttatcaaaaa 
ctttgccctt caaaaaatag gacaaaagtg agtttcttga gattcttcta taacaaaaat 
2281 ctcaccccac ttttttgttt tattgcatag ccattatgaa atttaaatta ttatctattt 
llf\ "tttaagtta cttacatagt ttatgtatcg cagtctatta gcctattcaa atgattctgc 
2401 aaagaacaaa aaagattaaa a 



FIGURE 1 






^^PNSTI^SXTLTTlCOLZKTCBISEPKEBVyXIUSKlCTPICTSBBYKAAI^DSASi^ 



FIGURE 2 






* ataaaaaaaa gcaaactaca aagaaaatgt caaggcgtaa 

ill "ataggctc ctataggcaa tga.acaa.t cttgattttg tattacaaa? 

"} tctagaagtt tacaaaagec agattgagea ttataagaec tagtagtaat agatcaaaoa 
III TZl?£t?? ••Sctttta. agttcaaa.a tt..«.?t«g gatggX.ci *f£cl*3* 
241 tgatgatgat gaagaaaaea acteaaataa ataataagaa ttattaagga gagtcaatta 
III ^f^!! 9 ** 9 c *?* ttt " t tgataaaaaa agttggttet aaggtagig. aagatttgaa 
361 tttgaacgaa gatgaaaaea aaaagaatgg aetttetgaa tagcaagtga aagaagagta 
III att !f?* aC9 attact 9«»« aataggttaa gtattaaaat ttigtattta acatwart? 
ill P 0 ****"** ttaaatgaga gtggtggeca tagaagacae agaagagaaa cagattatga 
541 tactgaaaaa tggtttgaaa tateteatga ecaaaaaaat titgtatcaa tttacgceaa 
III gZSSSZZ temt ;5 t » tt WtMettaa agattatttt aataaaaae. .tt.tg.t" 
III ^!" tgt * •» c » tt *» e * gactagaaac tgaagcegaa ttctatgcct ttgatgattt 
III * teac "»« *tcaaactta etaataatte ttactagact gttaacitag acgttiattt 
III tltt£t?t c * ct 9 t J t * c tcgcttgct tag.ttttta ttatcacta? «2,attc«. 

5?* at »**agatctt ettataeaag aaattaatat aattttgaga aaattoatoa 
Ml act.tcttc, cagttgtett ttctcategc eacttalaa* "*tt?«ft 

,! a "'£!f?5 J? 0 ?" 9 ! ^ te *"**ttt agttaaetee teateataaa ttagegttaa 
Inll !2f ta,et *f actetttcte taeagactta aaattagttg acactaaeaa 

1081 agtceaagat tattttaagt tettataaga atteectegt ttgactlatg taagct'ot* 
uil lll~tZ~: 9t ! a ?! 9e !! c **«9ctgt agag.accC a.tgtttt.I tt«H«S* 
llll f** 0 *!*;** cectacetaa tteaattttg atttetactt 

1261 tgttaattta taaeatttga aattagagtt tggattagaa ecaaatattt tgacaaaaca 
llll I aa ^?: a "JSHS!* fc W*taaa ataateaaaa ..tctt.I.t 
llll !! a f^ tac • c f*f e 9* t 9 ctfeaagaaae eteeagaaaa cagatattaa aacaagctac 
liol J!^™" " tctca *" acaataaaaa teaagaagaa actcctgaaa ctaaagatga 
llll ?"* 9c *f" «tWtatgaa attttttgat catctttctg aattaacega 

llll n**2 a 2? a J * tc ;« c « tt * aettgtaagc taceeaagaa atttatgata gcttgcaeia 
llll a ™^? att *?* tc ;» CM atttaaagaa gtteaaatta agttacaaat atgaiatgga 
llll "•'•9 t »*« atggatacat teatagatct taagaatatt tatgaaaect taaacaatl* 
llll tctgttaata tateaaatee tcatggaaac atttcttatg aaetgacaaa 

llll ttlllSttt !!^ tttata gaecttaaac ta.g.attat "c«2g"" 

J^**? 6 "** •»Qt«g«*o0 aattttaatt taataaegtt aaaagtgcaa aaattoaate 
"I* "f£f** ta gaaagettag aagatattga tagtctttgc aaatctattg cttett£aa 
llll mttttm aatgttaata ttategeeag tttgctctat eeeaaeaata tttagaaaaa 
2041 tcctttcaat aageeeaate ttctattttt eaagcaattt gaataattga aaaatttoaa 
llll SllVAltt • tCMet r a ge.lat.ctt Lttetattt e^att^t 

llll J!? 4 !!?! 4 ! ..geattct tttgaaaaga tattatttat talaatatta 

llll tStllttit actaaattat ttaaaaeaet teaatagtta eetgaattaa attaagttta 
till f attMtt * 9 e ** tt *9«»9 aattgactgt gagtgaagta cataagtaag tatgalaaaa 
2401 JfEfJSf* ■ t «" ee * tt •t5t|«9t£t atcaaSaaf cVt'cllVAl 

cctt *»9et» atagattttg aecaaaaeae tgtaagtgat gaetetatta aaaaaatttt 
litl TatttlZ: llltT^ tlatt^Sg. ?tgan'ct! S aa I a cta5 

till fifiiJffi' acgaagaaat ttaagaaett eteaaagett gegaegaaaa 

llll !22!? fettt f r*"* 90 * 1 *«tataaatt eeetetatgt ttaeeaaetg gtaettatta 
2701 IZlltlZSt £ a 2£ a9 £ »5*? ,ttMt taaatattSg tttaaat.aa tatHHtat 
2761 ^? aa ^^ !! 9 f!! att » tfet 9«ataa taeataeaat agtcattttt agtgttttga 
2821 "aaaaateg 9ttatttaat aagtaaataa ttatttttea aCitttttt 
HSRIWQKKPQAPIGNETNX£FVZ£NLEVY)CSQIEHYXTQQQQX R
EEDIjajJCFKKQDODGNSCNDODDEEKNSHKQQSIXRRVNQIKQQVQLIKKVGSKVEK 
Dt^NEDENKKNOLSEQQVK£EQLHTIT£EQVKyQtavmiDYQZJ>I^5GGHRiaiRR 
ETDYDTBXWFE I SHDQXNYVS I YANQKTS YCWWLKD YFNKNNYDHLNVS I NRLETEAE 
FYAFDDFSQTIKLTNNSYQTVNIDVNTDNKI^ILAZXRrLLSLERFNILNIRSSYTRN 
QYKFEKIOELZJCTIFAVVPSHIUILQGIHLQVPCEAFQYLVNSSSQISVXDSQLQVYSF 
STDLKLVDTNKVQDYPKFLQEFPRLTHVSQQAX PVSATNAVENLNVLLKKVKHANLNL 
VSIPTQFNFOFYrVNI^HIJCI^FCl^PNILTKOKIiNIJXSIKQSKNlJCFIJlI.NFYTY 
VAQETSRKQIIJCQATTIKNIJCNNKNQEBTPETKDETPSESTSGMKFFDHLSELTELED 
FSVNLQATQEIYDSIjaaXIRSTNIJUCFKLSYKYEKEKSKMDTFIDIJWIYETIJfNIJC 
RCSVNISNPHGNISYELTNXDSTFYKFKLTIJfQELQHAKYTFKQNEFQFKNVKSAXIE 
SSSI^SIXDIDSLCKSIASCKNLQNVNIIASLLYPNNIQKNPFNKPNLLFFKQFEQLK 
NLENVS ZNCILDQHZLNS ISEFLEKNKXIKAFILKRYYLLQYYLDYTKLFKTLQQLPE 
LWQVYINQQLEELTVSEVHKQVWEKHXQKAFYEPLCEFIKESSQTLQLIDFDQWTVSD 
DSIKKIIJSSISESKYHHYIJILKPSQSSSLIKSENEEIQELIJCACDEKGVLVKAYYKFP 
LCLPTCTYYDYKSDRW 
PRIMER SET 1

primer1 Length: 100

1 GGGGCCATGG ATCAOCCGTC GTAACCAAAA GAAGCCGCAA OCTCCGATCO 
51 GCAACCAGAC CAACCTGCAC TTCOTTCTOC AAAACCTGGA CGTTTACAAC 

primer2 Length: 100

1 CTTCC TTCTT OAACTTCACC ACCTTCACCT CCTCCTCCTT CATTTOTTCT 
51 TOTTGGGTCT TGTAGTGCTC GATTTGGCTC TTGTAAACCT CCAGGTTTTG 

primer3 Length: 99 

1 TOCTGAAGTT CAAGAACCAA GACCAAGACG GGAACAGCGG CAACCACCAC 
51 GACGAOGAGG AGAACAACAG CAACAAGCAA CAAGAGCTCC TCCGTOGTG 

primer4 Length: 100

1 CCTCCTCCTT CAGGTTCAGG TCCTTCTCAA CCTTGCTGCC AACCTTCTTG 
51 ATCAGTTGAA CTTGTTGCTT GATTTGGTTA ACACGACGCA GCAGCTCTTO 

primer5 Length: 100

1 CCTGAACCTG AACGAGGACG AGAACAAGAA GAACGGCCTG AGCGAGCAAC 
51 AAGTTAAGGA CGAGCAACTG CGTACCATCA CCGAGGAGCA AGTTAAGTAC 

primer6 Length: 100

1 TCGGTCTCCC GACGGTGACG ACGGTGGCCG CCGCTCTCGT TCAGGTCCAG 
51 TTCGTACTCC ATGTTGAAAA CCAGGTTTT G CTACTTAACT TGCTCCTCGG 

primer7 Length: 99

1 CGTCACCGTC GOGAGACCGA CTACGACACC CAGAAGTGGT TCGAOATCAG 
51 CCACGACCAA AAGAACTACG TTAGCATCTA CGCTAACCAA AAGACCAGC 

primer8 Length: 100

1 CTCGGTCTCC AGACGCTTGA TGCTAACGTT CAGGTCGTCG TAGTTGTTCT 
51 TGTTGAAGTA GTCCTTCAGC CACCAACAGT AGCTGGTCTT TTCGTTAGCG 

primer9 Length: 99

1 TCAACCGTCT GGAGACCGAG GCTGAGTTCT ACGCTTTCGA CGACTTCAGC 
51 CAAACCATCA AGCTGACCAA CAACAGCTAC CAAACCGTTA ACATCGACC 

primer10 Length: 100

1 GATGTTCAGG ATGTTGAAAC GCTCCAGGCT CAGCAGGAAA CCCAGCAGAG 
51 CCAGGATACA CAGGTTGTTG TCGAAGTTGA CGTCGATCTT AACGGTTTGG 

primer11 Length: 99

1 CCTTTCAACA TCCTGAACAT CCCTAOCAGC TACACCCGTA ACCAATACAA 
51 CTTCGAAAAG ATCGGCGACC TGCTGGAGAC CATCTTCGCT OTT C TTTTC 

primer12 Length: 100

1 TTGGCTOCTO CTGTTAACCA CGTATTGGAA ACCCTCACAC COAACTTGCA 
51 GGTGCATGCC TTGCAGGTGA CCGTGGCTGA AAACAACAGC GAAGATGGTC 
PRIMER SET 2

primer13 Length: 99

1 TGCTTAACAG CAOCACCCAA ATCACCOTTA AGGACACCCA ACTGCAAGTT 
51 TACAGCTTGA GGACCGACCT CAAGCTGGTT GACAOCAACA AGCTTCAAO 

primer14 Length: 100

1 OGTTGGTACC CCTAACOGGG ATAGCTTGTT CGCTCAOCTC GGTCAGACGC 
51 GGGAACTCTT GCAGGAACTT CAAGTAGTCT TGAACCTTGT TGCTGTCAAC 

primer15 Length: 99

1 COGGTTAGOG CTACCAACGC TGTTGAGAAC CTGAACGTTC TCCTGAAGAA 
51 GGTTAAGGAC GCTAACCTCA ACCTGGTTAG CATCCOGACC GAATTCAAC 

primer16 Length: 100

1 AGCTTTTGCT TGGTCAGGAT GTTCGGCTCC AGGCCGAACT CCAGCTTCAG 
51 GTGTTGCAGG TTAAOGAAGT AGAAGTOGAA GTTGAATTGG GTCGGGATGC 

primer17 Length: 98

1 CATCCTCACC AAGGAAAAGC TGCAGAACCT GCTGCTGAGC ATCAAGCAAA 
51 GCAAGAACCT CAAGTTCCTG OGTCTGAACT TCTACACCTA OGTTCCTC 

primer18 Length: 100

1 ACTCTCCTCT TCCTTCTTGT TGTTCTTCAG GTTCTTGATC GTGGTAGCTT 
51 GCTTCAGGAT TTGCTTAOGG CTGCTCTCTT GAGCAACGTA GGTCTAGAAG 

primer19 Length: 100

* JJi^AOAC TCCGGAGACC AAGGACGAGA CCCCGAGCGA 

51 GAGCACCACC GGCATGAAGT TCTTCGACCA CCTGAGCCAG CTGACCGAGC 

primer20 Length: 100

s } ?S™S£I??F ACGGATCAGC AGCTTCTGCA CGCTGTCCTA GATCTCTTGO 
51 GTAGCTTGCA GGTTAACCCT GAAGTCCTCC AGCTCGCTCA GCTCGCTCAG 

primer21 Length: 100 

fSTS?! 0001 ACCACCXACC TGAAGAAGTT CAAGCTGACC TACAACTACG 
51 AGATGGAGAA GAGCAAGATG OACACCTTCA TCGATCTGAA CAACATCTAC 

primer22 Length: 100

1 GTTGGTCAGC TCGTAGCTGA TGTTGCCGTO CGGGTTGCTG ATGTTAAOGC 
51 TACAACGCTT CAGCTTGTTC AGGGTCTCGT AGATGTTCTT CAGATCGATO 

primer23 Length: 99

?2£!EI ACGA CCTGAOCAAC AAGGACAGCA CCTTCTACAA GTTCAAOCTG 
51 ACCCTGAACC AAGAGCTGGA ACACCCTAAG TACACCTTCA AGCAAAACG 

primer24 Length: 100 

1 ACACAGGCTC TCCATGTCCT CCAGCCTCTC CAGGCTGCTO CTCTCGATCT 
51 TAGCGCTCTT AAOGTTGTTO AATTGCAATT CGTTTTGCTT GAAGGTGTAC 
PRIMER SET 3

primer25 Length: 101

1 AGGACATCGA CAGCCTOTGT AAOACCATCG CCAGCTGTAA GAACCTGCAA 
51 AACGTTAACA TCATOCCTAG CCTGCTGTAC COGAACAACA TCCAAAAGAA 
101 c 

primer26 Length: 100

1 TACAOTTGAT GCTAAOGTTC TCCACGTTCT TCAGTTGCTC GAATTGCTTG 
51 AAGAAGAGGA GGTTCGGCTT GTTGAACGGG TTCTTTTGGA TCTTGTTCGG 

primer27 Length: 96

5?? AACG " A CWTCAACTG TATCCTGGAC CAACACATCC TGAACAGCAT 
51 CAGGGAGTTC CTGGAGAAGA ACAAGAAGAT CAAGGCTTTC ATCCTG 

primer28 Length: 100

«? CCCACTTGTT GCAGGGTCTT GAACACCTTC GTGTAGTCCA 

51 GGTACTATTO CAGCAGGTAG TAACGCTTCA GGATGAAAGC CTTGATCTTC 

primer29 Length: 101

* SJfJSSSE 0 CCACCTGAAC CAAGTTTACA TCAACCAACA ACTGGAGGAG 
1*7 CTCACCGTTA GCGAGGTTCA CAAGCAAGTT TCGGAGAACC ACAACCAAAA 
101 G 

primer30 Length: 100

1 ACGGTCTTTT GOTOGAAGTC GATCAGTTGC AGGGTTTGGC TGCTCTCCTT 
51 GATGAACTCA CACAGCGCCT CGTAGAAGGC CTTTTGCTTC TGCTTCTCCC 

primer31 Length: 100

«? AAAACACCGT TAGCCAOGAC AGCATGAAGA ACATCCTCGA 

51 GAGGATCAGC GAGAGCAAGT ACCACCACTA CCTGCGTCTG AACCCGAGCC 

primer32 Length: 100

* J^CCAGAAC GCOCTTCTCG TCACAAGCCT TCAGCAGCTC TTGGATCTCC 
51 TCGTTCTCGC TCTTGATCAG CCTGCTGCTT TGGCTOGGGT TCAGACGCAG 

primer33 Length: 100

1 CGAGAAGGGC CTTCTCGTTA AGCCTTACTA CAAGTTCCCG CTCTCTCTGC 
51 OGACOGGGAC CTACTACGAC TACAACACCG ACCGTTGGTG AGAGCTCCAC 

primer34 Length: 62

1 CCOGAAGCTT CCOGGGACTA GTTCTAGAGC GGCCGCCACC GCGGTGGAGC 
51 TCTCACCAAC GG 
PRIMER SET 4

F1 Length: 82

1 GGGCGGATCC ATCGAGATCO ACAACAACCA AGCTCAACAA COGAACCCTC 
51 AGAAOCTCTC OTGGGAOCTG CAOCTGCACA TO 

F2 Length: 100

1 CTCGTTAACO TTACCOCTCC TTCTCT6CTG CAAGAGGGCA CCTACTACCA 
51 AGACAAGGAC GACCOTOGTT ACATCATCAC CAAGGCTCTO CTGGAGGTTG 

F3 Length: 100

1 CCCTAOCACC ACCAACTACA TO O TT O CTTT CTOTCTTGTT CACAAGAACA 
51 CCCAACOGTT CATCGAGAAG TACTTCAACA AGGCTGTTCT GCTGCCGAAC 

F4 Length: 100

1 CAAGAACCTG TACCTGGACC GTATCCTGAG CCAAGATATC CGTAAGGAGC 
51 TGACCTTCCO TAAGTGTCTG CAACGTTOTG TTOGTAGCAA GTTCAGOGAG 

F5 Length: 100

1 COGTTACCTO AGCGTTACCA ACAAGCAAAA GTGGGACCAA ACCAAGAAGA 
51 AGCGTAAGGA GAACCTGCTG ACCAAGCTGC AAGCTATCAA GGAGACOGAC 

F6 Length: 100

1 GAAGCOGGCC OTTATCAAGA AGATCGCTAA GCGTCAAAAC GCTATGAAGA 
51 AGCACATGAA GGCTCCGAAG ATCCCGAACA CCACCCTGGA GAGCAAGTAC 

F7 Length: 100

1 CAAGATCCTG GGCAAGAAGT ACCCCAAGAC OGAGGAGGAG TACAAGGCTG 
51 CTTTCCGCGA CAGCGCTAGC GCTOOGTTCA ACCOCGAGCT CCCTGCCAAG 

F8 Length: 100

1 CTGAGGTTTG GGACAACCTG ATGAGCAGCA ACCAACTGOC GTACATGGCC 
51 ATCCTGOGTA ACCTGAGCAA CATCCTGAAG GCTGGCGTTA GCGACACCAC 

F9 Length: 100

1 COCCTGCAAT TCTTCAGCGC TATCGACGCT GTTAACGAGG CGGTTACCAA 
51 GGGCTTCAAG CCTAAGAAGC GTGAGAACAT GAACCTGAAG GGCCAAATCG 

R1 Length: 97

1 CCAGCGCTAA CGTTAACCAO GT A TT G CTTC GGGTCGTCGA TCTTAACACG 
51 AACTTGGATG TOGTTTTGGT TCTCTTGCAT CTCCAGCTCC AGCTCCC 

R2 Length: 100

1 GTAGTTOGTO GTGGTACGGA TCTACAGCTC GT7ACGGATG TAAACAGCCA 
51 GTTGACAGAT GAACTCOGGG TCGCTCTCAG CAACCTCCAO CAGAGCCTTG 

R3 Length: 99

1 CGGTCCAGGT ACAGGTTCTT GAACTCGGTA GCGTCGAAGA TGTACAGAAC 
51 TTGAGCGAAC TCACAAACCT CGAGCAGGTC GTTCGGCAGC AGAACAGCC 

R5 Length: 99

1 CTTCATAACG GCCGGCTTCA GAGCCTTGAT AGCGTCCTCA ACGTTCATCA 
51 TGTCGCCCGT CTCACGCTTG CTCTTGTCCT CGCTCTCCTT GATAGCTTG 
R6 Length: 100

1 GTACTTCTTO CCCACCATCT TGTAAACACG TTCCTTCCCC TCCCTGATGT 
51 GACAGAACTT GATGAGGTCC TTGAAGGTCA GGTACTTGCT CTCCAGGGTG 

R7 Length: 99

1 CAGGTTGTCC CAAACCTCAG CCCTCTTCCC CTTAGCGCTC AGCTCGTTCT 
51 CCCAOGTCTT GCTGATCTCC ATCTTCATAC GCTTGCCAGC CAGCTCCGG 

R8 Length: 99

1 CCCTGAACAA TTOCAGCGGG AACATCTTGC TGTTCTCAAC AGCCTTCGCC 
51 TCACAGATCT TGTTCATAAC GATGCTGTGO OTGGTGTCGC TAACGCCAG 

R9 Length: 101

1 CCAATTCOCC CTCCTCGCTT TGCTCCAGCT CCATGTCCTT CTTCTCCTOG 
51 TCCGTCTTCT CAACAACCTC CTTAACAGCC TCGATTTCGC CCTTCAGCTT 
101 C 
PRIMER SET 5



F10 Length: 96

1 CCOAOCACOO CGAATTCGTT AACCTTAACC ACGGCATCGG CAAGCAATAC 
51 ATCAACAGCA TCCAOCTCGC TATCAAGATC GCT6TGAACA AGAACC 

F11 Length: 99

1 CATGAGCGGC GGCGCTAAGA AGTACGGGAG CGTTCGTACC TGTCTGGAGT 
51 GTGCTCTGCT TCTGGGCCTG ATGGTTAAOC AACGTTGTGA GAAGAGCAG 

F12 Length: 100

1 CGGGCGACGA GCTGCGTCCG ACCATGCAAA AGCTGCTCCA AGAGAAGGGC 
51 AAGCTGGGCG GCGGCAOOGA CTTCCOGTAC GAGTGTATCG ATGAGTGGAC 

F13 Length: 100

1 CTACAGCGAC ATCAACCTTC GTGGGAGCAG CATCGTTAAC AGCATCAAGA 
51 AGXACAAGGA OGAGGTTAAC CCGAACATCA AAATCTTCGC TGTTCACCTC 

F14 Length: 85

1 CAAAATCTTC GGCATGAGOG ACAGCATCCT GAAGTTCATC AGCGCTAAGC 
51 AAGGCGGCGC TAAGATGGTG GAGGTGATCA AGAAC 

R10 Length: 100

1 CTTAGCGCOG CCGCTCATGC TGGTGCTCAT CCTGCCGCTG ACGTCGCXCA 
51 AGATAGCGGT GTGGCCCTTG ATCTOGTCCA GGTTCTTGTT CACAGOGATC 

R11 Length: 100

1 CACGCAGCTC GTCGCCCGGC AGGTCAACCT CGAGGTAACA CTTGTTACAT 
51 TGGCTGCTCG GGCTGCTGAA GATGTAGAAG CTGCTCTTCT GACAACGTTG 

R12 Length: 100

1 GAAOGTTGAT GTOGCTGTAO CCCTCAGOGA TCATCATGTC GCTCAGGATA 
51 ACGATGTTGT CAAOGTGGGT CtTOTlta' TO GTCCACTCAT CGATAGACTC 

R13 Length: 98

1 CGCTCATGCC GAAGATTTTG ATGTAGTTGT TCTCCTTCAA CTCGTOGCCC 
51 AGGTTGAGAC ACTTGCOCTA GCCCTOGAGG TCAACAGCGA AGATTTTG 

R14 Length: 85

1 GGGCGGTACC AAGCTTTCTA GACTAGTCTG CAGTCACTTT TGGCCGATCT 
51 TTTGCAGAGC GAAGTTCTTG ATCACCTCCA CGATG 
gggoggatccatggagatcgagaacaaccaagctcaacaaccgaaggctgagaagctgtgg

tgggagctggacctggagatogaagacaaccaaaacx3acatccaagttcgtgttaagatcg 

aogacccgaaccjataoctqcttaaocttacxx^ctuctix;tct c ctgcaag 

ctaccaagacaaggacgagcgtcgttacatcatcaccaaggctctgctggaggttgctgag 

agogacccggagttcatctgtcaactggctgtttacatccgtaacgagctgtacatccgta 

ccaccaccaactacatcgttgctttctgtgttgttcacaagaacacccaaccgttcatoga 

gaagtacttcaacaaggctgttctgctgccgaacgacctgctggaggtttgtgagttcgc? 

caagttctctacatcttogacgctaccgacttcaagaacctgtacctggaccgtatcctga 

gccaagatatccgtaaggagcxgaccttccgtaagtgtctccaacgttgtgttcgtagcaa 

gttcagogagttcaacgagtaccaactgggcaagtactgtaccgagagccaacctaagaag 

accatgttccgttacctgagcgttaccaacaagcaaaagtgggaccaaaccaagaagaacc 

gtaaggagaacctgctgaccaagctgcaagctatcaaggagagcgaggacaagagcaagcg 

tgagacoggcgacatcatgaacgttgaggaocctatcaaggctctgaagcoggcogttatg 

aagaagatogctaagcgtcaaaaogctatgaagaagcagatgaaggctccgaagatcccga 

acagcaccctggagagcaagtacctoaccttcaaggacctgatcaagttctgtcacatcag 

cgagcogaag oaac otctttacaagatcctgggcaagaagtacccgaagaccgaccagoag 

tacaaggctgctttcggogacagogctagccctcccttcaacccggagctggctggcaagc 

gtatgaagatcgagatcagcaagacctgggagaacgagctgagcgctaagggcaacaccgc 

TGAGGTTTGGGACAACCTGATCAGCAGCAACCAACTGCCGTACATGGCCATGCTGCGTAAC 
CTGAGCAACATCCTGAAGGCTGGOGTTAGCGACACCACCCACAGCATOGTTATCAACAAGA 
TCTGTGAGCOGAAGGCTGTTGAGAACAGCAAGATGTTCCOGCTGCAATTCTTCAGCGCTAT 
CGAGGCTGTTAACGAGGCGGTTACCAAGGGCTTCAAGGCTAA6AAGCGTGAGAACATGAAC 
CTGAAGGGCCAAATOGAGGCTGTTAAGGAGGTTGTTGAGAAGACOGAOGAGGAGAAGAAGG 
ACATGCAGCTGGAGCAAACOGAGGAGGGCGAATTCGTTAAGGTTAACGAGCGCATCGGCAA 
CCAATACATCAACAGCATCGAGCTGCCTATCAACATOGCTGTCAACAAGAACCTGGACGAG 
ATCAAGGGOCACACOGCTATCTTCAGCGACGTCAGCGGCAGCATGAGCACCAGCATGAGOG 
CCGGCGCTAAGAAGTACGGCAGCGTTCGTACCTGTCTGGAGTGTGCTCTGGTTCTGGGCCT 
GATGGTTAAGCAAOGTTGTGAGAAGAGCAGCTTCTACATCTTCAGCAGCCCGAGCAGCCAA 
TOTAACAAGTGTTACCTGGAGGTTGACCTGCCGGGOGAOGAGCTGCGTCOGAGCATGCAAA 
AGCTGCTCCAAGAGAAGGGCAACCTCGGCGGCGCCACOCACTTCCCGTAOGAGTGTATCGA 
TGAGTGGACCAAGAACAAGACCCAOGTTGACAACATCGTTATCCTGAGCGACATCATGATC 
GCTGAGGGCTACAGCGACATCAACGTTCGTGCCAGCAGCATCGTTAACAGCATCAAGAAGT 
ACAAGGAOGAGGTTAACCCGAACATCAAAATCTTCGCTGTTGACCTGGAGGGCTACGGCAA 
GTGTCTGAACCTGGGOGAOGAGTTCAAOGAGAACAACTACATGAAAATCTTCGGCATGAGC 
CACAGCATCCTOAAGTTCATCAGOGCTAAGCAAGGCGGCCCTAACATGGTGGAGGTGATCA 
AGAACTTCGCTCTGCAAAAGATCGGCCAAAAGTGACTGCAGACTAGTCTAGAAAGCTTGGT 
ACOGCCC 
GGGGCCATGGATGMCCGTCGTAACCAAAAGAAGCCGCAACCTCro

AACCTGGACXXGGX XCTUCAAAACCTGGAGGTTTACAAGACCGAAATCGAGGACTAGAAGA 

CCCAACAAGAACAAATCAAGGAGGAGGACCTOAAGCTGCTGAAGn^^ 

MACGGttACAGCGGCMCGACGACGAOaACGAGGAGAAGAACAGCJUt^^ 

CTCCTGCGTOGTGTTAACCAAATCAAGCAACAAGTTCAACTOATCAAGAAGGTTGCC^ 

AOCTTGAQAAGCACCTGAACCTOAA06AOGACGA6AACAACAAGAA06GCCTGAGOGAOCA 

ACAMTTAAGOAOOAOCAACTGCGTACCATCACCOAGCAOCAAGTTAACTACGAAAACCTC 

OTTTTCAACATGGACTACCAACTGCACCTGAAOOAGAGCGGCOOCCACCCTOGTCACOGTC 

GC0A0ACOGACTACOACACO6AGAACTGGTTOGAGATCAOCCAO0ACCAAAACAACTACCT 

TAGCATCTACGCTAACCAAAAGACCAGCTACTGTTCGTGGCTGAAGGACTACTTGAACAAG 

AACA ACTAOQACCACCTCAACQTTAGCATCAAOCOTCTGCAGACCQAOCCTCAGTTCTACQ 

CTTTCCACCACTTCACCCAAACCATCAACCTOACCAACAACAOCTACCAAACCOTTAACAT 

CQAOCTCAACTTOQACAACAACCTQTOTATCCTQCCTCTCCTQCGTTTCCl^ 

CAOOOTTTCAACATCCTCAACATOOCTAOCACCTACACCOCTAACCAATACAACTTOOAAA 

AQATaSCCGACCTOCTOOAGACC A TCTTCGCraTT U TTTl^^ 

CATCCACCTOCAAGTTCCOTOTOAOOCTTTCCAATACCTCCTTAACAGCAOCAOCCAAATC 

ACOGTTAAOC ACACCCAACTGCAAOTTTACAOCTTCAC CACCOACCTG AAGCTGGTTGACA 

CCAACAACCTTCAAOACTACTTCAACTTCCTGCAAOACTTCCOCCOTCTCACCCACGTCAC 

CCAACAAGCTATCCCGG TT AGOG CTACCAACGCTG TTGAG AACCTG AACGTTCTCCTCAAG 

A AGQTT AAOCAOCCTAACCTGAACCTCGTTAOCATCCOGAOCCAATTCAACTTOGACTTCT 

ACTTCGTTAACCTGCAACACCTGAAGCTGGAGTTCGCCCTGGAGCCCAACATCCTGACCAA 

GCAAAACCTGCAOAACCTGCTGCTCAGCATCAAGCAAAGCAAC AACCTG AAOTTCCTGCCT 

CTGAACTTCTACACCTACGTTGCTCAAGAOACCAGCOGTAAGCAAATCCTGAAGCAAGCTA 

S?^?^? CAMAACC ^^ C ^ C ^ C ^ GMCC * MA ^^ GACTCCTCMACC ^ 0 ^AOGA 

GACCCOTAGCGAGAGCACCAGCGGCATGAA 

CTGMMACTTCAGCGTTAACCTGGAAGCTACCCAAGAGA 

TCCl^TOCOTAOCACCAACCTOAACAACTTCAAGCTOACCTACAAOT 

OA ??^ CA1WACXCC "^ TMA TCTGAAOAACAT^ 

COT ^AGC» TTAACATCAOCAACCOOC^OGCCAACATCAgCTAOOJU^ 

^CACCACCTTCTACAAQTTCAACCTGACCCTGAACCAAGAGCTCCAACAOGCTAACTAgJlg 

CTTCAAGnAAAOGAATTCCAATTCAACAAOGTTAAGAGCGCTAAGATOG^ 

ATCAACTGTATCC^ACCAACACATCCTGAACAGCATCJMJCCAGTTCCTO 
AGAAGATCAAGGCTTTCATCCTGAAGCGTTACTACCTGCTGCAATACTACCTGGACTACAC 
CAAGWTTC^ACreTGCAACA^ 
CTGCAGGACCTGACCGTTACCGAGGTTCAC3Ui^^ 

^TTCTACCAGCCGCTGTGTGAGTTCATCAAGGAGAGCAGCCAAACCCTGCAACTGATCGA 
CTTCGACCAAAACAWGTTAGCGAOGACAGCATCAAGAAGATCCTCGAGAGCATCAGC3GAG 

AGAACGAGGAGATCCAAGAGCTGCTGAAGGCTTGTGAOGAGAAGGGCGTTCTGGTTAAGGC 
TTACTACAAGTTCCCGCTGTGTCTOCCOACOGGCACCTACTACGACTACAACAGCGACCGT 
TGGTGAGAGCTCCACCOCGGTOGCOGCCGCTCTAOAACTAOTCCCGGGAAGCTTGGGG 